System and method for classification of objects of a computer system

ABSTRACT

Methods and systems are described in the present disclosure for classifying malicious objects. In an exemplary aspect, a method includes: collecting data describing a state of an object of a computer system, forming a vector of features, calculating a degree of similarity based on the vector, calculating a limit degree of difference that is a numerical value characterizing the probability that the object being classified will certainly belong to another class, forming a criterion for determination of the class of the object based on the degree of similarity and the limit degree of difference, determining that the object belongs to the determined class when the data satisfies the criterion, wherein the data is collected over a period of time defined by a data collection rule, and pronouncing the object as malicious when it is determined that the object belongs to the specified class.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims benefit of priority under 35 U.S.C. 119(a)-(d) to Russian Patent Application No. 2018147230, filed Dec. 28, 2018, which is incorporated by reference herein.

FIELD OF TECHNOLOGY

The present disclosure relates to data analysis technologies, specifically to systems and methods of classification of objects of a computer system.

BACKGROUND

The rapid growth of computer technologies in the past decade, and also the widespread use of different types of computing devices (personal computers, notebooks, tablets, smartphones, and so on), has strongly influenced the use of those devices in diverse areas of activity for a large number of tasks (from Internet surfing to bank transfers and electronic document traffic). In parallel with the growth in the number of computing devices and the software running on these devices, the number of malicious programs has also grown at a rapid pace.

A large variety of malicious programs exist at present, some of which steal personal and confidential data from the devices of users (such as logins and passwords, bank details, electronic documents). Other malicious programs form “botnets” from the devices of users for attacks such as a DDoS (Distributed Denial of Service) or for sorting through passwords using brute force against other computers or computer networks. Still other malicious programs propose paid content to users through intrusive advertising, paid subscriptions, sending of SMS to toll numbers, and so forth.

Specialized programs known as antivirus programs are used to deal with malicious programs, including detection of the malicious programs, prevention of infection, and restoration of the working capacity of the computing devices infected with malicious programs. Antivirus programs employ various technologies to detect the full variety of malicious programs, such as:

-   -   static analysis—analysis of programs for harmfulness, without
        running or emulating the programs being analyzed, based on data
        contained in the files constituting the programs being analyzed,
        whereby it is possible to use during static analysis:
        -   signature analysis—searching for correspondences of a
            particular segment of code of the programs being analyzed to
            known code signatures from a database of signatures of
            malicious programs;
        -   white and black lists—a search for calculated check sums of
            the programs being analyzed (or portions thereof) in a
            database of check sums of malicious programs (black lists)
            or a database of check sums of safe programs (white lists);
    -   dynamic analysis—analysis of programs for harmfulness based on
        data obtained in the course of execution or emulation of the
        programs being analyzed, whereby it is possible to use during
        dynamic analysis:
        -   heuristic analysis—emulation of the programs being analyzed,
            the creation of emulation logs (containing data on the calls
            of API functions, the parameters transmitted, the code
            segments of the programs being analyzed, and so on) and the
            search for correspondences between the data of the logs
            created and the data from a database of behavioral
            signatures of malicious programs;
        -   proactive protection—intercepting the calls of API functions
            of the launched programs being analyzed, creating logs of
            the behavior of the programs being analyzed (containing data
            on the calls of API functions, the parameters transmitted,
            the code segments of the programs being analyzed, and so on)
            and searching for correspondences between the data of the
            logs created and the data from a database of calls of
            malicious programs.

Both static and dynamic analysis have their advantages and disadvantages. Static analysis is less demanding of resources of the computing device on which the analysis is being performed. Further, since static analysis does not require the execution or the emulation of the program being analyzed, static analysis is faster, but at the same time less effective than dynamic analysis. In other words, static analysis often has a lower percentage of detection of malicious programs and a higher percentage of false alarms (i.e., pronouncing a verdict that a file analyzed by the means of the antivirus program is malicious, even though it is safe) than dynamic analysis. Dynamic analysis is slower because it uses data obtained during the execution or emulation of the program being analyzed, and it places higher demands on the resources of the computing device on which the analysis is being performed, but it is also more effective. Modern antivirus programs employ a comprehensive analysis, including elements of both static and dynamic analysis.

Since modern standards of computer security require an operative response to malicious programs (especially to new malicious programs), automatic detection of malicious programs is the primary focus of attention. For automatic detection to operate effectively, elements of artificial intelligence and various methods of machine learning of models for the detection of malicious programs (i.e., sets of rules for decision making as to the harmfulness of a file on the basis of a certain set of input data describing the malicious file) are often used. This enables effective detection not only of well-known malicious programs or malicious programs with well-known malicious behavior, but also of new malicious programs having unknown or little-studied malicious behavior, as well as an operative adaptation (learning) to detect new malicious programs.

The present disclosure makes it possible to solve the problem of detecting malicious files.

SUMMARY

The disclosure is directed towards the classification of objects of a computer system in order to determine whether the objects are malicious.

One technical result of the present disclosure includes increasing the accuracy of classification of objects of a computer system by the use of two stages of evaluation of the classes to which the objects of the computer system belong.

An exemplary method for detecting malicious objects on a computer system comprises: collecting data describing a state of an object of the computer system, forming a vector of features characterizing the state of the object, calculating a degree of similarity based on the formed vector of features, wherein the degree of similarity is a numerical value characterizing the probability that the object being classified may belong to a given class, calculating a limit degree of difference that is a numerical value characterizing the probability that the object being classified will certainly belong to another class, forming a criterion for determination of the class of the object based on the degree of similarity and the limit degree of difference, determining that the object belongs to the determined class when the data satisfies the criterion, wherein the data is collected over a period of time defined by a data collection rule, and pronouncing the object as malicious when it is determined that the object belongs to the specified class.

In one aspect, the criterion is a rule for the classification of the object by an established correlation between the degree of similarity and the limit degree of difference.

In one aspect, the correlation between the degree of similarity and the limit degree of difference is one or more of: a difference in distance between the degree of similarity and the limit degree of difference from a predetermined threshold value; a difference in the area bounded in a given time interval between the degree of similarity and the limit degree of difference from a predetermined threshold value; and a difference in the rate of mutual growth of the curve describing the change in the degree of harmfulness and the limit degree of difference from a predetermined value.

In one aspect, the vector of features is a convolution of the collected data organized in the form of a set of numbers.

In one aspect, the data collection rule is one of: an interval of time between different states of the object satisfies a predetermined value, and a change in a parameter of the computer system resulting in a change in state of the object satisfies a given value.

In one aspect, the limit degree of difference being calculated depends on the degree of similarity, and the limit degree of difference is calculated one of: at the instant of creating the object, at the instant of a first change in state of the object, and based on analysis of static parameters of the object.

In one aspect, if in the course of the period defined by the data collection rule at least two degrees of similarity and limit degrees of difference have been calculated, a set of consecutively calculated degrees of similarity and limit degrees of difference is described by a predetermined time law.

In one aspect, the time laws describing the consecutively calculated degrees of similarity and the consecutively calculated limit degrees of difference are monotonic.

An exemplary system described herein comprises a hardware processor configured to: collect data describing a state of an object of the computer system, form a vector of features characterizing the state of the object, calculate a degree of similarity based on the formed vector of features, wherein the degree of similarity is a numerical value characterizing the probability that the object being classified may belong to a given class, calculate a limit degree of difference that is a numerical value characterizing the probability that the object being classified will certainly belong to another class, form a criterion for determination of the class of the object based on the degree of similarity and the limit degree of difference, determine that the object belongs to the determined class when the data satisfies the criterion, wherein the data is collected over a period of time defined by a data collection rule, and pronounce the object as malicious when it is determined that the object belongs to the specified class.

The above simplified summary of example aspects serves to provide a basic understanding of the present disclosure. This summary is not an extensive overview of all contemplated aspects, and is intended neither to identify key or critical elements of all aspects nor to delineate the scope of any or all aspects of the present disclosure. Its sole purpose is to present one or more aspects in a simplified form as a prelude to the more detailed description of the disclosure that follows. To the accomplishment of the foregoing, the one or more aspects of the present disclosure include the features described and exemplarily pointed out in the claims.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a structural diagram of a system 100 for machine learning of a model for detection of malicious files, in accordance with exemplary aspects of the present disclosure.

FIG. 2 is a flow diagram of a method for machine learning of a model for detection of malicious files, in accordance with exemplary aspects of the present disclosure.

FIG. 3 shows examples of the dynamics of change in the degree of harmfulness as a function of the number of behavior patterns, in accordance with exemplary aspects of the present disclosure.

FIG. 4 shows an example of a diagram of relations between elements of behavior patterns, in accordance with exemplary aspects of the present disclosure.

FIG. 5 is a structural diagram of a system for detection of malicious files with the use of a trained model of detection of malicious files, in accordance with exemplary aspects of the present disclosure.

FIG. 6 is a flow diagram of a method for detection of malicious files with the use of a trained model of detection of malicious files, in accordance with exemplary aspects of the present disclosure.

FIG. 7 is a structural diagram of a system for detection of a malicious file, in accordance with exemplary aspects of the present disclosure.

FIG. 8 is a flow diagram of a method for detection of a malicious file, in accordance with exemplary aspects of the present disclosure.

FIG. 9 shows examples of the dynamics of change in the degree of harmfulness and the limit degree of security as a function of the number of behavior patterns, in accordance with exemplary aspects of the present disclosure.

FIG. 10 is a structural diagram of a system for classification of objects of a computer system, in accordance with exemplary aspects of the present disclosure.

FIG. 11 is a flow diagram of a method for classification of objects of a computer system, in accordance with exemplary aspects of the present disclosure.

FIG. 12 illustrates an example of a general-purpose computer system, a personal computer or a server, in accordance with exemplary aspects of the present disclosure.

DETAILED DESCRIPTION

The disclosed system and method are directed to classifying objects on a computer system as malicious or safe, in accordance with exemplary aspects of the present disclosure. Example aspects are described herein in the context of a system, method, and computer program product for classifying objects on a computer system as malicious or safe. Those of ordinary skill in the art will realize that the following description is illustrative only and is not intended to be in any way limiting. Other aspects will readily suggest themselves to those skilled in the art having the benefit of this disclosure. Reference will now be made in detail to implementations of the example aspects as illustrated in the accompanying drawings. The same reference indicators will be used to the extent possible throughout the drawings and the following description to refer to the same or like items.

The following definitions are used throughout the disclosure to describe the various aspects.

Malicious file—a file whose execution is known to result in unauthorized destruction, blocking, modification or copying of computer information or neutralization of computer information protection systems.

Malicious behavior of an executable file—a group of actions that may be performed during execution of that file and that are known to be able to result in unauthorized destruction, blocking, modification or copying of computer information or neutralization of computer information protection systems.

Malicious activity of an executable file—a group of actions performed by that file in accordance with its malicious behavior.

Computing device of the average user—a hypothetical (theoretical) computing device, having the average characteristics of the computing devices of a previously selected group of users, on which the same applications are executed as on the computing devices of those users.

Command executable by a computing device—a set of machine instructions or instructions of scripts executable by a computing device on the basis of the parameters of those instructions, known as command parameters or parameters describing said command.

Lexical analysis (tokenizing)—a process of analytical parsing of an input sequence of characters into recognized groups (hereafter: lexemes), in order to form identification sequences at the output (hereafter: tokens).

Token—an identification sequence formed from a lexeme in the process of lexical analysis.

FIG. 1 is a structural diagram of a system 100 for machine learning of a model for detection of malicious files, in accordance with exemplary aspects of the present disclosure.

The system 100 for machine learning consists of a training selection preparation module 111, a behavior log forming module 112, a behavior pattern forming module 121, a convolution function forming module 122, a detection model creating module 131, and a detection model machine learning module 132.

In one variant aspect, the system 100 has a client-server architecture, in which the training selection preparation module 111, the behavior log forming module 112, the convolution function forming module 122, the detection model creating module 131, and the detection model machine learning module 132 work at the server side, and the behavior pattern forming module 121 works at the client side.

For example, the client may be the computing devices of a user, such as a personal computer, a notebook, a smartphone, and so forth. The server may be the computing devices of an antivirus company, such as distributed systems of servers that perform at least a preliminary collection and antivirus analysis of files, a creation of antivirus records, and so forth. The system 100 is used in some aspects to detect malicious files at the client side, thereby enhancing the effectiveness of the antivirus protection of that client.

In yet another example, both the client and the server may be the computing devices of the antivirus company alone, wherein the system 100 may be used for automated antivirus analysis of files and creation of antivirus records, thereby enhancing the working effectiveness of the antivirus company.

In exemplary aspects, the training selection preparation module 111 is configured to:

-   -   select at least one file from a database of files 113 in
        accordance with predetermined rules for forming a learning
        selection of files, after which the detection model machine
        learning module 132 will carry out the teaching of the detection
        model on the basis of an analysis of the selected files;
    -   send the selected files to the behavior log forming module 112.

In one variant aspect of the system 100, at least one safe file and one malicious file are kept in the database of files 113.

For example, the database of files 113 may keep, as safe files, the files of the Windows operating system, and as malicious files the files of backdoors, applications carrying out unauthorized access to data and remote control of an operating system and a computer as a whole. By training with the mentioned files and using methods of machine learning, the model for detection of malicious files will be able to detect malicious files having a functionality similar to the functionality of the aforementioned backdoors with high accuracy (the more files used for the teaching of the aforementioned detection model, the higher the accuracy).

In yet another variant aspect of the system, the database of files 113 additionally keeps at least:

suspicious files (riskware)—files which are not malicious, yet are able to carry out malicious actions;

unknown files—files whose harmfulness has not been determined and remains unknown (i.e., files which are not safe, malicious, suspicious, and so forth).

For example, the database of files 113 may have, as suspicious files, the files of applications for remote administration (such as RAdmin), archiving, or data encryption (such as WinZip), and so on.

In yet another variant aspect of the system, the database of files 113 keeps at least files:

-   -   collected by antivirus web crawlers;
    -   sent in by users.

The mentioned files are analyzed by antivirus experts, including with the help of automatic means of file analysis, in order to then pronounce a verdict as to the harmfulness of such files.

For example, the database of files 113 may store files that were sent in by users from their computing devices to the antivirus companies to be checked for harmfulness. The files transmitted may be either safe or malicious, and the distribution between the number of said safe and malicious files is close to the distribution between the number of all safe and malicious files located on the computing devices of said users (i.e., the ratio of the number of said safe files to the number of said malicious files differs from the ratio of the number of all safe files to the number of all malicious files located on the computing devices of said users by a quantity less than a specified threshold value:

$\left. {{{\frac{N_{clean}}{N_{malware}} - \frac{\forall N_{clean}}{\forall N_{malware}}}} < ɛ} \right).$

Unlike the files sent in by the users (i.e., files which are subjectively suspicious), the files collected by antivirus web crawlers that are designed to search for suspicious and malicious files more often prove to be malicious.

In yet another variant aspect of the system 100, at least one of the following conditions is used as the criteria for selecting files from the database of files 113:

-   -   the distribution between safe and malicious files selected from        the database of files 113 corresponds to the distribution        between safe and malicious files located on the computing device        of the average user;    -   the distribution between safe and malicious files selected from        the database of files 113 corresponds to the distribution        between safe and malicious files collected with the help of        antivirus web crawlers;    -   the parameters of the files selected from the database of files        113 correspond to the parameters of the files located on the        computing device of the average user;    -   the number of selected files corresponds to a predetermined        value, while the files themselves are selected at random.

For example, the database of files 113 contains 100,000 files, among which 40% are safe files and 60% are malicious files. 15,000 files (15% of the total number of files being kept in the database of files 113) are selected from the database of files 113 such that the distribution between the selected safe and malicious files corresponds to the distribution between the safe and the malicious files located on the computing device of the average user, amounting to 95 safe files for every 5 malicious files. For this purpose, 14,250 safe files (35.63% of the total number of safe files) and 750 malicious files (1.25% of the total number of malicious files) are chosen at random from the database of files 113.
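
The arithmetic of this example can be reproduced with a short Python sketch (the percentages are those quoted above; the code is illustrative only):

```python
# Sketch of the selection arithmetic: 15,000 files are drawn from a
# 100,000-file database (40% safe, 60% malicious) so that the sample
# matches the 95:5 safe-to-malicious ratio of the average user's device.

total_in_db = 100_000
safe_in_db = int(total_in_db * 0.40)          # 40,000 safe files
malicious_in_db = int(total_in_db * 0.60)     # 60,000 malicious files

sample_size = 15_000
safe_needed = int(sample_size * 0.95)         # 14,250 safe files
malicious_needed = sample_size - safe_needed  # 750 malicious files

print(safe_needed / safe_in_db)               # 0.35625 -> 35.63% of safe files
print(malicious_needed / malicious_in_db)     # 0.0125  -> 1.25% of malicious files
```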

In yet another example, the database of files 113 contains 1,250,000 files, of which 95% are safe files and 5% are malicious files. Thus, the distribution between safe and malicious files being kept in the database of files 113 corresponds to the distribution between the safe and the malicious files located on the computing device of the average user. Of these files, 5,000 files are chosen at random, approximately 4,750 of which prove to be safe files and approximately 250 prove to be malicious files, with a high probability.

In yet another variant aspect of the system, the file parameters are at least:

-   -   the harmfulness of the file, characterizing whether the file is        safe, malicious, potentially dangerous, or the behavior of the        computer system when executing the file is not determined, and        so forth;    -   the number of commands performed by the computing device during        the execution of the file;    -   the size of the file;    -   the applications utilizing the file.

For example, files that contain scripts in the “ActionScript” language, executable by the application “Adobe Flash”, and not exceeding 5 kB in size, are chosen from the database of files 113 as malicious.

In yet another variant aspect of the system, the training selection preparation module 111 is additionally designed to:

-   -   select at least one other file from the database of files 113 in        accordance with predetermined rules of forming a test selection        of files, after which the detection model machine learning        module 132 will carry out a verification of the trained model of        detection on the basis of an analysis of the selected files;    -   send the selected files to the behavior log forming module 112.

In another example, the database of files 113 may contain 75,000 files, 20% of which are safe files and 80% of which are malicious files. Initially, 12,500 files are chosen from the database of files 113, 30% of which are safe files and 70% of which are malicious files. Subsequently, the detection model machine learning module 132 teaches the detection model 133 on the basis of an analysis of the selected files. After this step, 2,500 files are selected from the remaining 62,500 files, of which 60% are safe files and 40% are malicious files, and after this the detection model machine learning module 132 will check the trained detection model 133 based on analysis of the selected files. The data formulated in the above-described manner is referred to as a cross-validation set of data.

In one aspect, the behavior log forming module 112 is configured to:

-   -   intercept at least one executable command at least during:        -   the execution of the file received,        -   the emulation of the execution of the file received, wherein            the emulation of the execution of the file includes the            opening of the mentioned file (for example, the opening of a            script by an interpreter);    -   determine for each intercepted command at least one parameter        describing that command;    -   form a behavior log 115 of the obtained file on the basis of the        intercepted commands and the parameters so determined, wherein        the behavior log constitutes the totality of intercepted        commands (hereinafter, the command) from the file, where each        command corresponds at least to one parameter so determined and        describing that command (hereinafter, the parameter).

For example, the following is an example of commands intercepted during the execution of a malicious file that collects passwords and transmits them via a computer network, and the parameters calculated for said commands:

-   -   CreateFile, ‘c:\windows\system32\data.pass’
    -   ReadFile, 0x14ea25f7, 0xf000
    -   connect, http://stealpass.com
    -   send, 0x14ea25f7, 0xf000

In one variant aspect of the system 100, the intercepting of commands from the file is done with the aid of at least:

-   -   a specialized driver;
    -   a debugger;
    -   a hypervisor.

For example, the intercepting of commands during the execution of the file and the determination of their parameters is performed using a driver that utilizes an interception by splicing of the entry point of a WinAPI function.

In yet another example, intercepting commands during emulation of the execution of a file is performed directly by the emulation software or hardware that determines the parameters of the command to be emulated.

In yet another example, intercepting commands during execution of the file on a virtual machine is performed by means of a hypervisor that determines the parameters of the command to be emulated.

In yet another variant aspect of the system, the intercepted commands from the file include at least:

-   -   API functions;
    -   sets of machine instructions describing a predetermined set of
        actions (macro commands).

For example, malicious programs very often perform a search for certain files and modify their attributes, for which they employ a sequence of commands such as:

-   -   FindFirstFile, ‘c:\windows\system32\*.pass’, 0x40afb86a
    -   SetFileAttributes, ‘c:\windows\system32\data.pass’
    -   FindNextFile, 0x40afb86a
    -   CloseHandle, 0x40afb86a

which may in turn be described by only a single command

-   -   _change_attributes, ‘c:\windows\system32\*.pass’

In yet another variant aspect of the system, each command is matched up with a unique identifier.

For example, all WinAPI functions may be matched up with numbers in the range of 0x0000 to 0x8000, where each WinAPI function corresponds to a unique number (for example, ReadFile→0x0010, ReadFileEx→0x0011, connect→0x03A2).

In yet another variant aspect of the system, several commands describing similar actions are matched up with a single identifier.

For example, all commands such as ReadFile, ReadFileEx, ifstream, getline, getchar and so forth, which describe a reading of data from a file, are matched up with the identifier _read_data_file (0x70F0).
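
A minimal Python sketch of such an identifier scheme follows; the tables below are hypothetical and reuse only the identifiers mentioned in the examples above:

```python
# Unique per-function identifiers versus one shared identifier for a
# group of commands that all read data from a file (values hypothetical
# except those quoted in the examples above).

UNIQUE_IDS = {"ReadFile": 0x0010, "ReadFileEx": 0x0011, "connect": 0x03A2}

GROUP_IDS = {name: 0x70F0                     # _read_data_file
             for name in ("ReadFile", "ReadFileEx", "ifstream",
                          "getline", "getchar")}

def command_id(name: str, grouped: bool = False) -> int:
    """Look up the identifier of an intercepted command."""
    return (GROUP_IDS if grouped else UNIQUE_IDS)[name]

print(hex(command_id("ReadFileEx")))                # 0x11
print(hex(command_id("ReadFileEx", grouped=True)))  # 0x70f0
```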

In one aspect, the behavior pattern forming module 121 is configured to:

-   -   form at least one behavior pattern on the basis of the commands
        and parameters selected from the behavior log, wherein the
        behavior log constitutes the totality of executable commands
        (hereinafter, the command) from the file, where each command
        corresponds at least to one parameter describing that command
        (hereinafter, the parameter), the behavior pattern being a set
        of at least one command and such a parameter, which describes
        all of the commands of that set (hereinafter, the elements of
        the behavior pattern);
    -   send the behavior patterns so formed to the convolution function
        forming module 122.

For example, the following commands c_(i) and parameters p_(i) are selected from the behavior log:

-   -   {c₁, p₁, p₂, p₃},    -   {c₂, p₁, p₄},    -   {c₃, p₅},    -   {c₂, p₅},    -   {c₁, p₅, p₆},    -   {c₃, p₂}.

On the basis of the selected commands and parameters, behavior patterns are formed, each containing one command and one parameter describing that command:

-   -   {c₁, p₁}, {c₁, p₂}, {c₁, p₃}, {c₁, p₅}, {c₁, p₆},
    -   {c₂, p₁}, {c₂, p₄}, {c₂, p₅},
    -   {c₃, p₂}, {c₃, p₅}.

Next, on the basis of the patterns so formed, behavior patterns are formed in addition containing one parameter each and all the commands which can be described by that parameter:

-   -   {c₁, c₂, p₁},
    -   {c₁, c₃, p₂},
    -   {c₁, c₂, c₃, p₅}.

After this, on the basis of the patterns so formed, behavior patterns are formed in addition, each containing several parameters and all the commands which can be described by those parameters at the same time:

-   -   {c₁, c₂, p₁, p₅}.
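
A possible reading of these three steps is sketched below in Python; the grouping rule used here is an assumption and may generate more multi-parameter patterns than the example lists (for instance, it also surfaces the pair {p₂, p₅}):

```python
# Sketch of behavior pattern formation: {command, parameter} pairs,
# then one parameter with every command it describes, then parameter
# pairs that jointly describe more than one command.
from collections import defaultdict
from itertools import combinations

log = [("c1", {"p1", "p2", "p3"}), ("c2", {"p1", "p4"}), ("c3", {"p5"}),
       ("c2", {"p5"}), ("c1", {"p5", "p6"}), ("c3", {"p2"})]

pairs = {(c, p) for c, params in log for p in params}

by_param = defaultdict(set)          # parameter -> commands it describes
for c, p in pairs:
    by_param[p].add(c)

multi = {}                           # parameter pair -> shared commands
for a, b in combinations(sorted(by_param), 2):
    shared = by_param[a] & by_param[b]
    if len(shared) > 1:
        multi[(a, b)] = shared

print(sorted(by_param["p5"]))        # ['c1', 'c2', 'c3']
print(sorted(multi[("p1", "p5")]))   # ['c1', 'c2']
```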

In one variant aspect of the system, the commands and parameters are chosen from the behavior log on the basis of rules that select at least:

-   -   every i-th command in succession and the parameters describing
        it, the increment “i” being specified in advance;
    -   the commands executed after a predetermined period of time (for
        example, every tenth second) since the previously selected
        command, and the parameters describing them;
    -   the commands and the parameters describing them that are
        executed in a predetermined time interval from the start of
        execution of the file;
    -   the commands from a predetermined list and the parameters
        describing them;
    -   the parameters from a predetermined list and the commands
        described by those parameters;
    -   the first k or a random k parameters of commands in the case
        where the number of command parameters is greater than a
        predetermined threshold value.

For example, from the behavior log one selects all the commands for working with a hard disk (such as CreateFile, ReadFile, WriteFile, DeleteFile, GetFileAttribute and so on) and all the parameters describing the selected commands.

In yet another example, from the behavior log one selects every thousandth command and all the parameters describing the selected commands.

In one variant aspect of the system, the behavior logs are formed in advance from at least two files, one of which is a safe file and the other a malicious file.

In yet another variant aspect of the system, each element of the behavior pattern is matched up with a characteristic such as the type of element of the behavior pattern. The type of element of the behavior pattern (command or parameter) is at least:

-   -   a “number range”, if the element of the behavior pattern can be        expressed as a number    -   for example, for an element of the behavior pattern constituting        the parameter port_(html)=80 of the connect command, the type of        said element of the behavior pattern may be a “number value from        0x0000 to 0xFFFF”,    -   a “string”, if the element of the behavior pattern can be        expressed in the form of a string,    -   for example, for an element of the behavior pattern constituting        the connect command, the type of said element of the behavior        pattern may be a “string less than 32 characters in size”,    -   if the element of the behavior pattern can be expressed in the        form of data described by a predetermined data structure, the        type of that element of the behavior pattern may be a “data        structure”    -   for example, for an element of a behavior pattern constituting        the parameter src=0x336b9a480d490982cdd93e2e49fdeca7 of the find        record command, the type of this element of the behavior pattern        may be the “data structure MD5”.

In yet another variant aspect of the system, the behavior pattern additionally includes, as elements of the behavior pattern, tokens formed on the basis of lexical analysis of said elements of the behavior pattern with the use of at least:

-   -   predetermined rules for the formation of lexemes,    -   a previously trained recurrent neural network.

For example, with the aid of lexical analysis of the parameter

-   -   ‘c:\windows\system32\data.pass’

on the basis of the rules for formation of lexemes:

-   -   if the string contains the path to a file, determine the disk on        which the file is located;    -   if the string contains the path to a file, determine the folders        in which the file is located;    -   if the string contains the path to a file, determine the file        extension;

where the lexemes are:

-   -   the paths to the file;    -   the folders in which the files are located;    -   the names of the files;    -   the extensions of the files;

the tokens can be formed:

-   -   “paths to the file”→    -   ‘c:\’,    -   “folders in which the files are located”→    -   ‘windows’,    -   ‘system32’,    -   ‘windows\system32’,    -   “extensions of the files”→    -   ‘.pass’.
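
These rules can be approximated with the standard library, as in the following hedged sketch (the dictionary keys simply mirror the token names above):

```python
# Sketch of lexeme-to-token formation for a file path: disk, folders,
# and extension are extracted as the rules above describe.
from pathlib import PureWindowsPath

def path_tokens(raw: str) -> dict:
    p = PureWindowsPath(raw)
    return {
        "paths to the file": p.drive + "\\",
        "folders in which the files are located": list(p.parts[1:-1]),
        "extensions of the files": p.suffix,
    }

print(path_tokens(r"c:\windows\system32\data.pass"))
# {'paths to the file': 'c:\\',
#  'folders in which the files are located': ['windows', 'system32'],
#  'extensions of the files': '.pass'}
```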

In yet another example, with the aid of lexical analysis of the parameters

-   -   ‘81.19.82.8’, ‘81.19.72.38’, ‘81.19.14.32’

on the basis of the rule for formation of lexemes:

-   -   if the parameters constitute IP addresses, determine the bit
        mask (or its analog, expressed by meta-characters) describing
        said IP addresses (i.e., the bit mask M for which the equality
        M ∧ IP = const is true for all those IPs);

the token can be formulated:

-   -   ‘81.19.*.*’.
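
A per-octet comparison is one simple way to derive such a token; the sketch below assumes octet granularity rather than an arbitrary bit mask:

```python
# Derive the common-mask token from a list of IP addresses: octets that
# agree across all addresses are kept, differing octets become '*'.

def ip_mask_token(addresses: list[str]) -> str:
    columns = zip(*(addr.split(".") for addr in addresses))
    return ".".join(col[0] if len(set(col)) == 1 else "*"
                    for col in columns)

print(ip_mask_token(["81.19.82.8", "81.19.72.38", "81.19.14.32"]))
# 81.19.*.*
```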

In yet another example, from all available parameters comprising numbers, the tokens of the numbers are formed in predetermined ranges:

-   -   23, 16, 7224, 6125152186, 512, 2662162, 363627632, 737382, 52,        2625, 3732, 812, 3671, 80, 3200

sorting is done by ranges of numbers:

-   -   from 0 to 999
        -   → {16, 23, 52, 80, 512, 812},
    -   from 1000 to 9999
        -   → {2625, 3200, 3671, 7224},
    -   from 10000 on
        -   → {737382, 2662162, 363627632, 6125152186}.

In yet another variant aspect of the system, tokens are formed from elements of a behavior pattern which consist of strings.

For example, the behavior pattern is a path to a file containing the names of the disk, the directory, the file, the file extension, and so forth. In this case, the token may be the name of the disk and the file extension.

C:\Windows\System32\drivers\acpi.sys

-   -   →        -   C:\        -   *.sys

In one aspect, the convolution function forming module 122 is configured to:

-   -   form a convolution function from the behavior pattern such that
        the inverse convolution function of the result of that
        convolution function performed on the obtained behavior pattern
        will have a degree of similarity with the obtained behavior
        pattern greater than a specified value, i.e.:
        r ~ g⁻¹(g(r))
    -   where:
        -   r is the behavior pattern,
        -   g is the convolution function,
        -   g⁻¹ is the inverse convolution function;
    -   send the convolution function so formed to the detection model
        machine learning module 132.

In one variant aspect of the system, the convolution function forming module 122 is additionally configured to:

-   -   calculate the feature vector of a behavior pattern on the basis        of the obtained behavior pattern, wherein the feature vector of        the behavior pattern may be expressed as the sum of the hash        sums of the elements of the behavior pattern; and/or    -   form a convolution function from the feature vector of the        behavior pattern, where the convolution function constitutes a        hash function such that the degree of similarity of the        calculated feature vector and the result of the inverse hash        function of the result of that hash function of the calculated        feature vector is greater than a predetermined value.

In yet another variant aspect of the system, the convolution function is formed by the metric learning method, i.e., such that the distance between the convolutions obtained with the aid of said convolution function for behavior patterns having a degree of similarity greater than a predetermined threshold value is less than a predetermined threshold value, while for behavior patterns having a degree of similarity less than the predetermined threshold value it is greater than the predetermined threshold value.

For example, the feature vector of the behavior pattern may be calculated as follows:

-   -   first an empty bit vector is created, consisting of 100,000
        elements (where one bit of information is reserved for each
        element of the vector);
    -   1,000 elements of the vector are set aside for storing data
        about the commands c_(i) of the behavior pattern γ, and the
        remaining 99,000 elements are set aside for the parameters
        p_(i) of the behavior pattern γ, wherein 50,000 elements (from
        element 1001 to element 51000) are set aside for string
        parameters and 25,000 elements (from element 51001 to
        element 76000) for number parameters;
    -   each command c_(i) of the behavior pattern γ is matched up with
        a certain number x_(i) from 0 to 999, and the corresponding bit
        is set in the vector so created:
        v[x_(i)]=true;
    -   for each parameter p_(i) of the behavior pattern γ the hash sum
        is calculated by the formula:
        for strings: y_(i)=1001+crc32(p_(i)) (mod 50000)
        for numbers: y_(i)=51001+crc32(p_(i)) (mod 25000)
        for the rest: y_(i)=76001+crc32(p_(i)) (mod 24000),
    -   and depending on the calculated hash sum the corresponding bit
        is set in the created vector: v[y_(i)]=true.

The described bit vector with the elements so set constitutes the feature vector of the behavior pattern γ.
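
A direct Python rendering of this construction is given below; zlib.crc32 stands in for the CRC32 mentioned above, and the command numbering x_(i) is assumed to come from a lookup table:

```python
# Sketch of the 100,000-element feature bit vector: bits 0..999 encode
# commands, 1001..51000 string parameters, 51001..76000 number
# parameters, as described above.
import zlib

def feature_vector(commands, str_params, num_params, command_ids):
    v = [False] * 100_000
    for c in commands:
        v[command_ids[c]] = True                                 # x_i in 0..999
    for p in str_params:
        v[1001 + zlib.crc32(p.encode()) % 50_000] = True         # y_i, strings
    for p in num_params:
        v[51_001 + zlib.crc32(str(p).encode()) % 25_000] = True  # y_i, numbers
    return v

ids = {"CreateFile": 0, "ReadFile": 1, "connect": 2}             # hypothetical
v = feature_vector(["CreateFile", "connect"],
                   [r"c:\windows\system32\data.pass"], [0xF000], ids)
print(sum(v))  # number of bits set in the feature vector
```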

In yet another variant aspect of the system, the feature vector of the behavior pattern is computed by the following formula:

$D = {\sum\limits_{i}{b^{i} \times {h\left( r_{i} \right)}}}$

where b is the base of the positional system of computation (for example, for a binary vector b=2, for a vector representing a string, i.e., a group of characters, b=8), r_(i) is the i-th element of the behavior pattern, h is the hash function, where 0≤h(r_(i))<b.
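
For instance, the formula can be evaluated as follows, with crc32 reduced modulo b standing in for a hash h satisfying 0≤h(r)<b (an assumption for illustration):

```python
# Sketch of the positional feature D = sum(b**i * h(r_i)); the choice
# of crc32 % b as the hash h is an assumption.
import zlib

def positional_feature(elements: list[str], b: int = 8) -> int:
    return sum(b ** i * (zlib.crc32(r.encode()) % b)
               for i, r in enumerate(elements))

print(positional_feature(["CreateFile", "ReadFile", "connect"]))
```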

For example, the feature vector of the behavior pattern may be computed as follows:

-   -   first yet another empty bit vector is created (different from
        the previous example), consisting of 1,000 elements (where one
        bit of information is reserved for each element of the vector);
    -   the hash sum for each pattern element r_(i) of the behavior
        pattern γ is calculated by the formula:
        x_(i)=crc32(r_(i)) (mod 1000)
    -   and depending on the computed hash sum, the corresponding bit is
        set in the created vector: v[x_(i)]=true.

In yet another variant aspect of the system, the feature vector of the behavior pattern constitutes a Bloom filter.

For example, the feature vector of the behavior pattern may be computed as follows:

-   -   first yet another empty vector is created (different from the
        previous examples), consisting of 100,000 elements;
    -   at least two hash sums for each pattern element r_(i) of the
        behavior pattern are calculated by means of a set of hash
        functions {h_(j)} by the formula:
        x_(ij)=h_(j)(r_(i))
        where:
        h_(j)(r_(i))=crc32(r_(i)),
        h_(j)(0)=const_(j)
    -   and depending on the computed hash sums, the corresponding
        elements are set in the created vector: v[x_(ij)]=true.
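
In Python, such a Bloom filter can be sketched with seeded CRC32 calls; the seed values const_(j) below are hypothetical:

```python
# Sketch of the Bloom-filter feature vector: each element is hashed by
# several seeded hash functions and every resulting position is set.
import zlib

SEEDS = [0x01, 0x7F, 0xA5]   # hypothetical const_j seed values
SIZE = 100_000

def bloom_vector(elements: list[str]) -> list[bool]:
    v = [False] * SIZE
    for r in elements:
        for seed in SEEDS:
            v[zlib.crc32(r.encode(), seed) % SIZE] = True
    return v

def maybe_contains(v: list[bool], r: str) -> bool:
    """True if r may be in the set; False if it certainly is not."""
    return all(v[zlib.crc32(r.encode(), seed) % SIZE] for seed in SEEDS)

v = bloom_vector(["CreateFile", "connect"])
print(maybe_contains(v, "connect"))     # True
print(maybe_contains(v, "DeleteFile"))  # almost certainly False
```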

In yet another variant aspect of the system, the size of the result of the formulated convolution function of the feature vector of the behavior pattern is less than the size of that feature vector of the behavior pattern.

For example, the feature vector constitutes a bit vector containing 100,000 elements, and thus having a size of 12,500 bytes, while the result of the convolution function of said feature vector constitutes a set of 8 MD5 hash sums and thus has a size of 256 bytes, i.e., ~2% of the size of the feature vector.

In yet another variant aspect of the system, the degree of similarity of the feature vector and the result of the inverse hash function of the result of said hash function of the calculated feature vector constitutes a number value in the range of 0 to 1 and is calculated by the formula:

$w = \frac{\sum\left( \left\{ h\left( r_{i} \right) \right\} \wedge \left\{ g_{i} \right\} \right)}{\sum\left\{ h\left( r_{i} \right) \right\}}$

-   -   where: h(r_(i)) ∧ g_(i) signifies the congruence of h(r_(i))
        with g_(i); {h(r_(i))} is the set of results of the hash
        functions of the elements of the behavior pattern;
    -   {g_(i)} is the set of results of the inverse hash function of
        the result of the hash function of the elements of the behavior
        pattern;
    -   r_(i) is the i-th element of the behavior pattern;
    -   h is the hash function;
    -   w is the degree of similarity.

For example, the calculated feature vector constitutes the bit vector

101011100110010010110111011111101000100011001001001001110101101101010001100110110100100010000001011101110011011011,

the result of the convolution function of this feature vector is

1010011110101110101,

and the result of the inverse convolution function of the above-obtained result is

101011100100010010110111001111101000100011001001010001110101101101110001100110110100000010000001011101110011011011

(where the underline denotes elements different from the feature vector). Thus, the similarity of the feature vector and the result of the inverse convolution function is 0.92.
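
Computed over bit strings, the degree of similarity reduces to the fraction of set bits of the feature vector that are preserved, as in this short sketch (the 6-bit strings are illustrative only):

```python
# Sketch of the degree of similarity w: matched set bits divided by the
# number of set bits in the original feature vector.

def similarity(feature_bits: str, restored_bits: str) -> float:
    matched = sum(a == "1" == b for a, b in zip(feature_bits, restored_bits))
    return matched / feature_bits.count("1")

print(similarity("110101", "110001"))  # 3 of 4 set bits match -> 0.75
```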

In yet another variant aspect of the system, the aforementioned hash function using an element of the behavior pattern as a parameter depends on the type of element of the behavior pattern: h(r_(i))=h_(r_(i))(r_(i)).

For example, in order to compute the hash sum of a parameter from the behavior pattern constituting a string containing the path to the file, we use the hash function CRC32; for any other string, the Huffman algorithm; for a data set, the hash function MD5.

In yet another variant aspect of the system, the forming of the convolution function of the feature vector of a behavior pattern is done by an autoencoder, where the input data are the elements of that feature vector of the behavior pattern, and the output data are data having a coefficient of similarity to the input data greater than a predetermined threshold value.

In one aspect, the detection model creating module 131 is configured to:

-   -   create a detection model for malicious files, which includes at
        least:
        -   selecting a method for machine learning of the detection
            model;
        -   initializing the parameters of the teaching model, where the
            parameters of the teaching model initialized prior to the
            start of the machine learning of the detection model are
            known as hyperparameters and depend on the parameters of
            the files selected by the training selection preparation
            module 111;
    -   send the teaching model so created to the detection model
        machine learning module 132.

For example, when selecting the method for machine learning of the detection model, at first a decision is made whether an artificial neural network or a random forest should be used as the detection model; if the random forest is chosen, one selects the separating criterion for the nodes of the random forest, and if an artificial neural network is chosen, a method of numerical optimization of the parameters of the artificial neural network is selected. The decision as to the choice of a particular method for machine learning is made on the basis of the effectiveness of that method in the detecting of malicious files (i.e., the number of errors of the first and second kind occurring in the detecting of malicious files) with the use of input data (behavior patterns) of a predetermined kind (i.e., the data structure, the number of elements of the behavior patterns, the performance of the computing device on which the search is conducted for malicious files, the available resources of the computing device, and so on).

In yet another example, the method for machine learning of the detection model is selected on the basis of one or more of:

-   -   cross-testing, sliding check, cross-validation (CV);    -   mathematical validation of the criteria AIC, BIC and so on;    -   A/B testing, split testing;    -   stacking.

In yet another example, in the event of low performance of the computing device, a random forest is chosen; otherwise, the artificial neural network is chosen.

In one variant aspect of the system, machine learning is performed for a previously created untrained detection model (i.e., a detection model in which the parameters of that model cannot produce, on the basis of analysis of the input data, output data with accuracy higher than a predetermined threshold value).

In yet another variant aspect of the system, the method of machine learning of the detection model is at least:

-   -   decision tree-based gradient boosting;    -   the decision tree method;    -   the K-nearest neighbor (kNN) method;    -   the support vector machine (SVM) method.

In yet another variant aspect of the system, the detection model creating module 131 is additionally designed to create a detection model 133 on demand from the detection model machine learning module 132, where certain hyperparameters and methods of machine learning are chosen to be different from the hyperparameters and machine learning methods chosen for a previous detection model.

The detection model machine learning module 132 is configured to teach the detection model, in which the parameters of the detection model are computed with the use of the obtained convolution function on the obtained behavior patterns, where the detection model constitutes a set of rules for computing the degree of harmfulness of a file on the basis of at least one behavior pattern with the use of the computed parameters of that detection model.

For example, the detection model is trained with a known set of files selected by the training selection preparation module 111, wherein said set of files contains 60% safe files and 40% malicious files.

In one variant aspect of the system, the degree of harmfulness of a file constitutes a numerical value from 0 to 1, where 0 means that the file is safe, and 1 that it is malicious.

In yet another variant aspect of the system, a method of teaching the detection model is chosen which ensures a monotonic change in the degree of harmfulness of a file in dependence on the change in the number of behavior patterns formed on the basis of analysis of the behavior log.

For example, a monotonic change in the degree of harmfulness of a file means that, upon analyzing each subsequent behavior pattern, the calculated degree of harmfulness will be not less than the previously calculated degree of harmfulness (for example, after analysis of the 10th behavior pattern, the calculated degree of harmfulness is equal to 0.2; after analysis of the 50th behavior pattern, it is 0.4; and after analysis of the 100th behavior pattern, it is 0.7).
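
One simple way to guarantee this property, sketched below under the assumption that a running maximum over the raw model outputs is acceptable, is:

```python
# Enforce monotonicity: the reported degree of harmfulness never drops
# below any earlier value, whatever the raw model output does.

def monotonic_harmfulness(raw_scores: list[float]) -> list[float]:
    reported, peak = [], 0.0
    for score in raw_scores:
        peak = max(peak, score)
        reported.append(peak)
    return reported

print(monotonic_harmfulness([0.2, 0.15, 0.4, 0.35, 0.7]))
# [0.2, 0.2, 0.4, 0.4, 0.7]
```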

In yet another variant aspect of the system, the detection model machine learning module 132 is additionally configured to:

-   -   perform a check (e.g., verification) of the trained model of        detection on the obtained behavior logs formed on the basis of        analysis of files from a test selection of files, in order to        determine the correctness of the determination of the        harmfulness of files from the test selection of files;    -   in event of a negative result of the check, send a request to        one or more of:        -   the training selection preparation module 111 to prepare a            selection of files different from the current one used for            the teaching of the detection model; and        -   the detection model creating module 131 to create a new            detection model, different from the current one.

The trained detection model is verified as follows. The detection model 133 has been taught on the basis of a set of files selected by the training selection preparation module 111, for which it is known in advance whether each file is safe or malicious. In order to verify that the detection model 133 has been trained correctly, i.e., that the detection model is able to detect malicious files and pass over safe files, the model is verified. For this purpose, the detection model is used to determine whether files from another set of files selected by the training selection preparation module 111 are malicious, the maliciousness of these files being known in advance. After applying the model to the new set of files, the system 100 determines how many malicious files were “missed” and how many safe files were detected. If the number of missed malicious files and detected safe files is greater than a predetermined threshold value, that detection model 133 is considered to be improperly trained and must be retrained using machine learning (for example, on another training selection of files, using values of the parameters of the detection model different from the previous ones, and so forth).

For example, when performing the verification of the trained model, the system 100 verifies the number of errors of the first and second kind in the detecting of malicious files from a test selection of files. If the number of such errors is greater than a predetermined threshold value, a new teaching and testing selection of files is selected and a new detection model is created.

In yet another example, the teaching selection of files contained 10,000 files, of which 8,500 were malicious and 1,500 were safe. After the detection model was taught, the system verified the model on a test selection of files containing 1,200 files, of which 350 were malicious and 850 were safe. According to the results of the verification, 15 out of 350 malicious files failed to be detected (4%), while 102 out of 850 safe files (12%) were erroneously considered to be malicious. In the event that the number of undetected malicious files exceeds 5% or the number of accidentally detected safe files exceeds 0.1%, the trained detection model is considered to be improperly trained, according to one exemplary aspect.
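
The acceptance test of this example comes down to two rate comparisons, as the following sketch shows (the 5% and 0.1% thresholds are those quoted above):

```python
# Reject the model if more than 5% of malicious files are missed or
# more than 0.1% of safe files are falsely detected.

def model_is_acceptable(missed: int, malicious_total: int,
                        false_alarms: int, safe_total: int) -> bool:
    return (missed / malicious_total <= 0.05
            and false_alarms / safe_total <= 0.001)

print(model_is_acceptable(15, 350, 102, 850))
# False: ~4% missed is within bounds, but 12% false alarms exceeds 0.1%
```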

In one variant aspect of the system, the behavior log 115 of the system 100 is additionally formed on the basis of a previously formed behavior log of the system and commands intercepted after the forming of said behavior log of the system.

For example, after the start of the execution of a file for which it is necessary to pronounce a verdict as to the harmfulness or safety of that file, the intercepted executable commands and the parameters describing the commands are recorded in the behavior log 115. On the basis of an analysis of these commands and parameters, the degree of harmfulness of that file is calculated by the system 100. If no verdict was pronounced as to the file being considered malicious or safe based on the results of the analysis, the system 100 continues intercepting commands. The intercepted commands and the parameters describing them are recorded in the old behavior log or in a new behavior log. In the first case, the degree of harmfulness is calculated on the basis of an analysis of all commands and parameters recorded in the behavior log, i.e., even those previously used to calculate the degree of harmfulness.

In one aspect, the system 100 is configured to:

-   -   calculate the degree of harmfulness on the basis of the behavior
        log obtained from the behavior log forming module 112 and the
        detection model obtained from the detection model machine
        learning module 132, the degree of harmfulness of a file being a
        quantitative characteristic describing the malicious behavior of
        the executable file (for example, lying in the range from 0,
        where the file has only safe behavior, to 1, where the file has
        predetermined malicious behavior); and/or
    -   send the calculated degree of harmfulness to determine resource
        allocation.

The system 100 is also designed to, in one aspect, allocate computing resources of the computer system, on the basis of analysis of the obtained degree of harmfulness, for use in assuring the security of the computer system.

In one variant aspect of the system 100, the computing resources of the computer system include at least:

-   -   the volume of free RAM;    -   the volume of free space on the hard disks; and/or    -   the free processor time (quanta of processor time) which can be        spent on the antivirus scan (for example, with a greater depth        of emulation).

In yet another variant aspect of the system, the analysis of the degree of harmfulness consists in determining the dynamics of the change in the value of the degree of harmfulness after each of the preceding calculations of the degree of harmfulness and at least one of the following (see the sketch after this list):

-   -   allocating additional resources of the computer system in event        of an increase in the value of the degree of harmfulness; and/or    -   freeing up previously allocated resources of the computer system        in event of a decrease in the value of the degree of        harmfulness.
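
A minimal sketch of this policy follows; allocate() and release() are hypothetical hooks into a resource manager, not functions of the disclosed system:

```python
# React to the dynamics of the degree of harmfulness: allocate resources
# when it rises, free previously allocated resources when it falls.

def manage_resources(previous: float, current: float,
                     allocate, release) -> None:
    if current > previous:
        allocate(current - previous)    # harmfulness rising
    elif current < previous:
        release(previous - current)     # harmfulness falling

manage_resources(0.3, 0.5,
                 allocate=lambda d: print(f"allocating (+{d:.1f})"),
                 release=lambda d: print(f"releasing (-{d:.1f})"))
```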

FIG. 2 shows a flow diagram of a method 200 for machine learning of a model for detection of malicious files, in accordance with exemplary aspects of the present disclosure.

The method 200 for machine learning of a model for detection of malicious files contains a step 211 in which teaching selections of files are prepared, a step 212 in which behavior logs are formed, a step 221 in which behavior patterns are formed, a step 222 in which convolution functions are formed, a step 231 in which a detection model is created, a step 232 in which the detection model is trained, a step 241 in which the behavior of the computer system is tracked, a step 242 in which the degree of harmfulness is calculated, and a step 243 in which the resources of the computer system are managed.

In step 211, the training selection preparation module 111 is used to select at least one file from a database of files 113 according to predetermined criteria, wherein the teaching of the detection model will be done in step 232 on the basis of the selected files.

In step 212, the behavior log forming module 112 is used:

-   -   to intercept at least one command at least during:
        -   the execution of the file selected in step 211,
        -   the emulation of the working of the file selected in
            step 211;
    -   to determine for each intercepted command at least one parameter
        describing that command;
    -   to form, on the basis of the commands intercepted and the
        parameters determined, a behavior log of the obtained file,
        wherein the behavior log represents a set of intercepted
        commands (hereinafter, the command) from the file, where each
        command corresponds to at least one defined parameter describing
        that command (hereinafter, the parameter).

In step 221, the behavior pattern forming module 121 forms at least one behavior pattern on the basis of the commands and parameters selected from the behavior log formed in step 212. The behavior log 115 represents a group of executable commands (hereinafter, the command) from the file, where each command corresponds to at least one parameter describing that command (hereinafter, the parameter). The behavior pattern is, in one aspect, a set of at least one command and a parameter which describes all the commands from that set.

In step 222, the convolution function forming module 122 forms a convolution function of the behavior pattern formed in step 221, such that the inverse convolution function of the result of this convolution function performed on that behavior pattern will have a degree of similarity to the aforementioned behavior pattern greater than a specified value.
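One way to picture such a convolution is a set of hash buckets from which the original pattern can be approximately restored. The sketch below is an assumption-laden illustration (the bucket count, the hash choice and the Jaccard similarity measure are all arbitrary), not the convolution function actually formed by the module:

    import hashlib

    def bucket(item, size=256):
        # Stable hash of one pattern element into a bucket index.
        return int(hashlib.md5(item.encode()).hexdigest(), 16) % size

    def convolve(pattern, size=256):
        # The "convolution" is the set of occupied buckets.
        return {bucket(x, size) for x in pattern}

    def restore(conv, known_elements, size=256):
        # Approximate inverse convolution: every known element whose
        # bucket is occupied; always contains the original pattern.
        return {x for x in known_elements if bucket(x, size) in conv}

    def similarity(a, b):
        # Jaccard similarity between original and restored patterns.
        return len(a & b) / len(a | b)

    pattern = {"c1", "c2", "c3", "p1", "p2"}
    known = pattern | {"c4", "c5", "p3", "p4"}
    # Prints 1.0 unless two of the nine known elements share a bucket.
    print(similarity(pattern, restore(convolve(pattern), known)))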

In step 231, the detection model creating module 131 creates a detection model, which comprises one or more of:

-   -   selecting a method of machine learning of the detection model;
    -   initializing the parameters of the teaching model, where the parameters of the teaching model initialized prior to the start of the machine learning of the detection model are known as hyperparameters;
    -   selecting the hyperparameters in dependence on the parameters of the files selected in step 211.

In step 232, the detection model machine learning module 132 teaches the detection model created in step 231. The parameters of that detection model are calculated with the use of the convolution function formed in step 222, performed on the behavior patterns formed in step 221. The detection model constitutes a group of rules for calculating the degree of harmfulness of a file on the basis of at least one behavior pattern with the use of the calculated parameters of that detection model.

In step 241, the behavior log forming module 112:

-   -   intercepts at least one command being executed by the files running in the computer system;
    -   forms a behavior log of the system on the basis of the intercepted commands.

In step 242, the degree of harmfulness is calculated on the basis of the behavior log of the system formed in step 241 and the detection model trained in step 232.

In step 243, the computing resources are allocated on the basis of the analysis of the degree of harmfulness as calculated in step 242, for use in assuring the security of the computer system.

FIG. 3 shows examples of the dynamics of change in the degree of harmfulness as a function of the number of behavior patterns.

The graph 311 illustrates the dynamics of an arbitrary change in the degree of harmfulness as a function of the number of behavior patterns formed during the execution of a malicious file. The graph 312 illustrates the dynamics of a monotonic change in the degree of harmfulness as a function of the number of behavior patterns formed during the execution of a malicious file. The graph 321 illustrates the dynamics of an arbitrary change in the degree of harmfulness as a function of the number of behavior patterns formed during the execution of a safe file. The graph 322 illustrates the dynamics of a monotonic change in the degree of harmfulness as a function of the number of behavior patterns formed during the execution of a safe file.

In one variant aspect of the system, the degree of harmfulness of an executable file takes on a value in the range of 0 (the file has exclusively safe behavior) to 1 (the file has predetermined malicious behavior).

The graph 311 shows the dynamics of an arbitrary change in the degree of harmfulness as a function of the number of behavior patterns formed during the execution of a malicious file.

In the beginning, upon executing that file, the number of behavior patterns formed is not large, and what is more the malicious activity of the executable file might be absent or minimal (for example, an initialization of data occurs, which is natural to many files, including safe ones), so that the calculated degree of harmfulness differs slightly from 0 and does not exceed a predetermined threshold value (hereinafter, the criterion of safety), after passing which the behavior of the executable file ceases to be considered safe (on the graph, this threshold value is designated by a dotted line).

However, as time goes on the malicious activity of the executable file grows and the degree of harmfulness begins to approach 1, surpassing the criterion of safety, while the degree of harmfulness might not reach the predetermined threshold value (hereinafter, the criterion of harmfulness) after the passing of which the behavior of the executable file will be considered to be malicious (in the graph, this threshold value is designated by a dashed line).

After a period of growth, the malicious activity may cease and the degree of harmfulness will again tend toward 0 (time A). At a certain time, the degree of harmfulness will become greater than the criterion of harmfulness (time B) and the behavior of the executable file will be recognized as malicious; in consequence, the file itself will be recognized as malicious.

The time of recognizing the file as malicious might occur significantly later than the start of growth in malicious activity, since the described approach responds well to an abrupt growth in the degree of harmfulness, which occurs most often during prolonged, clearly manifested malicious activity of the executable file.

In the event that the malicious activity occurs episodically (left side of the graph 311), the calculated degree of harmfulness might not reach the threshold after which a verdict is pronounced as to the harmfulness of the behavior of the executable file, and consequently the harmfulness of the executable file itself.

In the case when the degree of harmfulness is not calculated on the basis of each behavior pattern formed (for example, because the performance of the computing device is not high), a situation is possible where the degree of harmfulness will be calculated at time A (when the malicious activity commences) and time C (when the malicious activity ends), but will not be calculated at time B (when malicious activity is occurring). The calculated degrees of harmfulness will not exceed the criterion of harmfulness, the activity of the executable file will not be recognized as malicious, and consequently the malicious file will not be detected.

The graph 312 shows the dynamics of a monotonic change in the degree of harmfulness as a function of the number of behavior patterns formed during the execution of a malicious file.

In the beginning, upon executing said file, the number of behavior patterns formed is not large, and what is more the malicious activity of the executable file might be absent or minimal (for example, an initialization of data occurs, which is natural for many files, including safe ones), so that the calculated degree of harmfulness differs little from 0 and does not exceed the predetermined threshold value (hereinafter, the criterion of safety), after passing which the behavior of the executable file ceases to be considered safe (on the graph, this threshold value is designated by a dotted line).

However, as time goes on the malicious activity of the executable file grows and the degree of harmfulness begins to approach 1, surpassing the criterion of safety, while the degree of harmfulness might not reach a predetermined threshold value (hereinafter, the criterion of harmfulness) after the passing of which the behavior of the executable file will be considered to be malicious (in the graph, this threshold value is designated by a dashed line).

After a period of growth (times A-B), the malicious activity may cease (times B-C), yet the degree of harmfulness will not decline; it only continues to grow during any malicious activity of the executable file. At a certain time, the degree of harmfulness will become greater than the criterion of harmfulness (time D) and the behavior of the executable file will be recognized as malicious; in consequence, the file itself will be recognized as malicious.

The time of recognizing the file as malicious might occur immediately after the manifesting of malicious activity, since the described approach responds well to a smooth growth in the degree of harmfulness, which occurs both during prolonged, clearly manifested malicious activity of the executable file and during frequent, episodic, less pronounced malicious activity.

In the event that the malicious activity occurs episodically (left side of the graph 312), the calculated degree of harmfulness over time might reach the value after which a verdict is pronounced as to the harmfulness of the behavior of the executable file and the harmfulness of the executable file itself.

In the case when the degree of harmfulness is calculated not on the basis of each behavior pattern formed (for example, because the performance of the computing device is not high), a situation is possible where the degree of harmfulness will be calculated at time A (when the malicious activity commences) and time C (when the malicious activity ends) but will not be calculated at time B (when malicious activity is occurring). Nevertheless, since the degree of harmfulness changes monotonically, the calculated degrees of harmfulness can only increase; at time C the degree of harmfulness will exceed the criterion of harmfulness, the activity of the executable file will be recognized as malicious, and consequently the malicious file will be detected. A minimal sketch of this monotonic behavior follows.
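The monotonic effect described above can be imitated with a running maximum. This is a simplification for illustration only (the disclosed model learns a monotonic law rather than applying a maximum):

    def monotonic_degrees(raw_degrees):
        # Running maximum: once malicious activity has raised the degree
        # of harmfulness, a later quiet period cannot lower it, so a
        # calculation skipped at time B is still reflected at time C.
        current = 0.0
        for d in raw_degrees:
            current = max(current, d)
            yield current

    print(list(monotonic_degrees([0.1, 0.6, 0.2, 0.9])))
    # [0.1, 0.6, 0.6, 0.9]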

The graph 321 shows the dynamics of an arbitrary change in the degree of harmfulness as a function of the number of behavior patterns formed during the execution of a safe file.

In the beginning, upon executing said file, the number of behavior patterns formed is not large, and what is more there is no malicious activity as such from the executable file, although “suspicious” actions might be executed which may also be performed during the execution of malicious files (for example, deletion of files, transfer of data in a computer network, and so on); therefore the calculated degree of harmfulness differs from 0 and does not exceed a predetermined threshold value (hereinafter, the criterion of safety), after passing which the behavior of the executable file ceases to be considered safe (on the graph, this threshold value is designated by a dotted line).

However, as time goes on the malicious activity of the executable file grows because of the execution of a large number of “suspicious” commands. The degree of harmfulness begins to approach 1 and, while the degree of harmfulness might not reach a predetermined threshold value (hereinafter, the criterion of harmfulness) after the passing of which the behavior of the executable file will be considered to be malicious (in the graph, this threshold value is designated by a dashed line), it may exceed the criterion of safety, so that the file may cease to be considered safe and become “suspicious”.

After a period of growth, the malicious activity may cease and the degree of harmfulness will again tend toward 0 (time C).

In the case when the degree of harmfulness is not calculated on the basis of each behavior pattern formed (for example, because the performance of the computing device is not high), a situation is possible where the degree of harmfulness will be calculated at time B (when the activity is most similar to malicious, i.e., becomes “suspicious”) but not at time A (when the “suspicious” activity increases) or at time C (when the “suspicious” activity is decreasing). In this situation, the calculated degree of harmfulness will exceed the criterion of safety, the activity of the executable file will be recognized as “suspicious” (it will not be considered safe), and consequently the file previously considered safe will not be recognized as safe.

The graph 322 shows the dynamics of a monotonic change in the degree of harmfulness as a function of the number of behavior patterns formed during the execution of a safe file.

In the beginning, upon executing said file, the number of behavior patterns formed is not large. Furthermore, there is no malicious activity from the executable file, although “suspicious” actions might be executed which may also be performed during the execution of malicious files (for example, deletion of files, transfer of data in a computer network, and so on). Therefore the calculated degree of harmfulness is not 0 and does not exceed a predetermined threshold value (hereinafter, the criterion of safety); if the degree of harmfulness exceeded the criterion of safety, the behavior of the executable file would cease to be considered safe (on the graph, this threshold value is designated by a dotted line).

However, as time goes on the malicious activity of the executable file grows on account of the execution of a large number of “suspicious” commands, and the degree of harmfulness begins to approach 1. The degree of harmfulness might not reach a predetermined threshold value (hereinafter, the criterion of harmfulness) after the passing of which the behavior of the executable file will be considered to be malicious (in the graph, this threshold value is designated by a dashed line). Moreover, the degree of harmfulness might not exceed the criterion of safety, in which case the file will continue to be considered safe.

After a period of growth (times A-B), the malicious activity may cease (times B-C), yet the degree of harmfulness will not decline. Instead, the degree of harmfulness continues to grow during any malicious activity of the executable file, yet does not exceed the criterion of safety. In this manner, the activity of the executable file will be regarded as safe and in consequence the file will be regarded as safe.

When the degree of harmfulness is calculated not on the basis of each behavior pattern formed (for example, because the performance of the computing device is not high), a situation is possible where the degree of harmfulness will be calculated at time B (when the activity is most similar to malicious, i.e., becomes “suspicious”) but not at time A (when the “suspicious” activity increases) or at time C (when the “suspicious” activity is decreasing). Nevertheless, since the degree of harmfulness changes monotonically, the calculated degrees of harmfulness can only increase; at times A, B and C the degrees of harmfulness will not exceed the criterion of safety, the activity of the executable file will be recognized as safe, and consequently the safe file will be recognized as safe.

The file may not be recognized as “suspicious” after “suspicious” activity has manifested itself, since the described approach affords a smooth growth in the degree of harmfulness, making it possible to avoid sharp peaks in the growth of the degree of harmfulness.

FIG. 4 shows an example of a diagram of relations between elements of behavior patterns, in accordance with exemplary aspects of the present disclosure.

The example of the diagram of relations between elements of behavior patterns contains commands 411 (clear circles), parameters 412 (hatched circles), an example of a behavior pattern with one parameter 421, and an example of a behavior pattern with one command 422.

During the execution of a file, the commands 411 were intercepted and the parameters 412 describing them were determined:

-   -   CreateFile 0x24e0da54 ‘.dat’
    -   {c1, p1, p2}
    -   ReadFile 0x24e0da54 ‘.dat’
    -   {c2, p1, p2}
    -   DeleteFile 0x24e0da54 ‘.dat’ ‘c:\’
    -   {c3, p1, p2, p3}
    -   CreateFile 0x708a0b32 ‘.dat’ 0x3be06520
    -   {c1, p2, p3, p5}
    -   WriteFile 0x708a0b32
    -   {c4, p3}
    -   WriteFile 0x708a0b32 0x3be06520 0x9902a18d1718b5124728190
    -   {c4, p3, p5, p6, p7}
    -   CopyMemory 0x3be06520 0x9902a18d1718b512472819
    -   {c5, p4, p5, p6}
    -   ReadFile 0x9902a18d1718b5124728f90
    -   {c2, p6, p7}

On the basis of those commands 411 and parameters 412, behavior patterns (421, 422) are formed and the relations between the elements of the behavior patterns are determined.

In a first step, patterns are formed containing one command 411 and one parameter 412 describing that command:

{c1, p1} {c1, p2} {c1, p3} {c1, p5} {c2, p1} {c2, p2} {c2, p6} {c2, p7} {c3, p1} {c3, p2} {c3, p3} {c4, p3} {c4, p5} {c4, p6} {c4, p7} {c5, p4} {c5, p5} {c5, p6}

In the example shown, 18 behavior patterns have been formed on the basis of 8 intercepted commands (with the parameters describing them).

In the second step, patterns are formed which contain one parameter 412 and all the commands 411 which can be described by that parameter 412:

{c1, c2, c3, p1} {c1, c2, c3, p2} {c1, c3, c4, p3} {c5, p4} {c1, c4, c5, p5} {c2, c4, c5, p6} {c2, c4, p7}

In the example shown, 7 further behavior patterns have been formed on the basis of the 8 intercepted commands (with the parameters describing them).

In the third step, patterns are formed which contain several parameters 412 and all the commands 411 which can be described by those parameters 412:

{c1, c2, c3, p1, p2} {c2, c4, p6, p7} {c4, c5, p5, p6}

In the example given, three behavior patterns have been formed in addition on the basis of the eight intercepted commands (with the parameters describing them). The three steps are reproduced programmatically in the sketch below.
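The following Python sketch derives the patterns of the first two steps directly from the intercepted commands above; for the third step it adopts one plausible reading (the intersection of the command sets of a group of parameters), which matches the three patterns listed but is an assumption about the selection rule:

    from collections import defaultdict

    # The intercepted commands 411 and the parameters 412 describing them
    # (the FIG. 4 example above).
    log = [
        ("c1", ["p1", "p2"]),
        ("c2", ["p1", "p2"]),
        ("c3", ["p1", "p2", "p3"]),
        ("c1", ["p2", "p3", "p5"]),
        ("c4", ["p3"]),
        ("c4", ["p3", "p5", "p6", "p7"]),
        ("c5", ["p4", "p5", "p6"]),
        ("c2", ["p6", "p7"]),
    ]

    # First step: unique {command, parameter} pairs.
    step1 = sorted({(c, p) for c, ps in log for p in ps})

    # Second step: for every parameter, all the commands it describes.
    by_param = defaultdict(set)
    for c, ps in log:
        for p in ps:
            by_param[p].add(c)

    # Third step (assumed reading): for a group of parameters, the
    # commands described by all of them at once.
    def multi_param_pattern(*params):
        cmds = set.intersection(*(by_param[p] for p in params))
        return sorted(cmds) + sorted(params)

    print(len(step1))                       # 18 {command, parameter} pairs
    print(multi_param_pattern("p1", "p2"))  # ['c1', 'c2', 'c3', 'p1', 'p2']
    print(multi_param_pattern("p6", "p7"))  # ['c2', 'c4', 'p6', 'p7']
    print(multi_param_pattern("p5", "p6"))  # ['c4', 'c5', 'p5', 'p6']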

FIG. 5 shows a structural diagram of a system of detection of malicious files with the use of a trained model of detection of malicious files, in accordance with exemplary aspects of the present disclosure.

The structural diagram of the system 500 of detection of malicious files with the use of a trained model of detection of malicious files consists of the file being analyzed 501, a behavior log forming module 112, a detection model selection module 520, a database of detection models 521, a behavior log analysis module 530, a harmfulness module 540, a database of decision templates 541, and an analysis module 550.

In one variant aspect of the system, the system additionally contains a behavior log forming module 112 of the file being executed, which is configured to:

-   -   intercept at least one command at least during:
        -   a) the execution of the file 501; and/or
        -   b) the emulation of the execution of the file 501;
    -   determine for each intercepted command at least one parameter describing that command;
    -   form on the basis of the intercepted commands and the determined parameters a behavior log for that file, where the intercepted commands and the parameters describing them are recorded in the behavior log in chronological order from the earliest intercepted command to the most recent intercepted command (hereinafter, writing in the behavior log);
    -   send the formed behavior log to the behavior log analysis module 530 and the detection model selection module 520.

In yet another variant aspect of the system 500, the behavior log is a set of executable commands (hereinafter, the command) of the file 501, where each command corresponds to at least one parameter describing that command (hereinafter, the parameter).

In yet another variant aspect of the system, the intercepting of commands of the file being executed 501 and the determination of the parameters of the intercepted commands are performed on the basis of an analysis of the performance of the computing device on which the system for detection of malicious files with the use of a trained model of detection of malicious files is running, including at least:

-   -   a determination as to whether it is possible to analyze the file being executed 501 for harmfulness (carried out with the aid of the behavior log analysis module 530, the harmfulness module 540 and the analysis module 550) up to the time when the next command will be intercepted;
    -   a determination as to whether the analysis of the file being executed 501 for harmfulness will result in a lowering of the computing resources of that computing device below a predetermined threshold value, the resources of the computing device being at least:
        -   the performance of that computing device;
        -   the volume of free RAM of that computing device;
        -   the volume of free space on information storage media of that computing device (such as hard disks);
        -   the bandwidth of the computer network to which that computing device is connected.

In order to increase the performance of the system of detection of malicious files with the use of a trained model of detection of malicious files, it may be necessary to analyze a behavior log not containing all the executable commands of the file being executed 501, since the entire sequence of actions carried out to analyze the file 501 for harmfulness takes up more time than the interval between two consecutively executed commands of the file being executed 501.

For example, the commands of the file being executed 501 are carried out (and consequently intercepted) every 0.001 s, but the analysis of the file 501 for harmfulness takes 0.15 s, so that all the commands intercepted during that interval of time will be ignored; thus it is enough to intercept only every 150th command.
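The interception step in this example follows directly from the two durations; a minimal sketch of the arithmetic (the function name is illustrative):

    import math

    def interception_step(command_period_s, analysis_time_s):
        # Commands that arrive while one harmfulness analysis is still
        # running cannot be processed, so only every N-th command is
        # intercepted.
        return max(1, math.ceil(analysis_time_s / command_period_s))

    print(interception_step(0.001, 0.15))  # 150: every 150th command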

In one aspect, the detection model selection module 520 is configured to:

-   -   select from the database of detection models 521 at least two models of detection of malicious files on the basis of the commands and parameters selected from the behavior log of the file being executed 501, the model of detection of malicious files being a decision-making rule to determine the degree of harmfulness; and
    -   send all selected models of detection of malicious files to the harmfulness module 540.

In one variant aspect of the system, the models of detection of malicious files kept in the database of detection models 521 have been previously trained by the method of machine learning on at least one safe file and at least one malicious file.

The model of detection of malicious files is described in greater detail in FIG. 1 to FIG. 4.

In yet another variant aspect of the system 500, the method of machine learning of the detection model is at least:

-   -   decision tree-based gradient boosting;
    -   the decision tree method;
    -   the K-nearest neighbor (kNN) method; and/or
    -   the support vector machine (SVM) method.

In yet another variant aspect of the system, the method of teaching the model for detection ensures a monotonic variation in the degree of harmfulness of the file in dependence on the variation in the number of behavior patterns formulated on the basis of the analysis of the behavior log.

For example, the calculated degree of harmfulness of the file 501 might only increase monotonically or not change in dependence on the number of behavior patterns formed on the basis of the analysis of the behavior log of that file 501. At the start of the execution of the file 501, the number of behavior patterns formed is insignificant, and the calculated degree of harmfulness of that file 501 will differ little from 0. As time goes on, the number of patterns formed will increase and the calculated degree of harmfulness of that file 501 will also increase; or, if there is no malicious activity of that file 501, the calculated degree of harmfulness will remain unchanged. Thus, whenever the degree of harmfulness of the file is calculated during the execution of a malicious file 501 (or from whatever record of the behavior log the forming of the behavior patterns began), it will reflect whether or not malicious activity of the file 501 has occurred up to the time of calculation of that degree of harmfulness.

In yet another variant aspect of the system, each model of detection of malicious files selected from the database of detection models 521 is trained to detect malicious files with predetermined unique characteristic features.

For example, the detection models kept in the database of detection models 521 may be trained to detect files:

-   -   having a GUI (graphical user interface);
    -   exchanging data in a computer network;
    -   encrypting files (such as malicious files of the “Trojan-Cryptors” family); and/or
    -   using network vulnerabilities for their propagation (such as malicious files of the “Net-Worms” family), P2P networks (such as malicious files of the “P2P-Worms” family), and so forth.

Thus, the malicious file may be detected with the use of several trained models for detection of malicious files. For example, the malicious file “WannaCry.exe”, which when executed encrypts data on a user's computing device and sends copies of itself to other computing devices connected to the same computer network, can be detected with the aid of detection model #1, trained to detect files utilizing vulnerabilities, detection model #2, trained to detect files designed to encrypt files, and detection model #3, trained to detect files containing text information interpretable as demands being made (for example, as to a form of payment, sums of money, and so forth). The degrees of harmfulness calculated with the aid of those models, as well as the times when the calculated degrees of harmfulness surpass the predetermined threshold value, might differ from each other. For example, the results of using the models for detection of malicious files by means of which it was possible to detect the malicious file 501 may be expressed in the following table:

TABLE #1

    detection model | limit degree of harmfulness | command No. from behavior log
    model #1        | 0.95                        | 374
    model #2        | 0.79                        | 288
    model #3        | 0.87                        | 302

File 501 is recognized as malicious in the event that the calculated degree of harmfulness is greater than 0.78. The degree of harmfulness (such as 0.78) characterizes the probability that the file for which the degree of harmfulness was calculated may prove to be malicious (78%) or safe (22%). If the file 501 can be recognized as being malicious with the use of several models for detection of malicious files, there is a higher probability that the file 501 will prove to be malicious. For example, for the models of detection of malicious files whose data is presented in Table #1, the summary degree of harmfulness can be calculated by the formula

w_total = 1 − ∏_{i=1}^{n} (1 − w_i) = 0.998635,

where

-   -   w_total is the summary degree of harmfulness;
    -   w_i is the degree of harmfulness calculated with the use of model i;
    -   n is the number of models for detection of malicious files used to calculate the summary degree of harmfulness.

Thus, the obtained summary degree of harmfulness (0.998635) is significantly higher than the predetermined threshold value (0.78) which the calculated degree of harmfulness must pass for the file to be recognized as malicious. That is, the use of several models for detection of malicious files allows substantially higher accuracy of determination of malicious files, and fewer errors of the first and second kind during the detecting of malicious files. The calculation is verified in the snippet below.
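The summary formula can be checked directly against the degrees in Table #1; the snippet simply evaluates it:

    from functools import reduce

    def summary_harmfulness(degrees):
        # w_total = 1 - prod(1 - w_i): the probability that at least one
        # of the detection models is right about the file being malicious.
        return 1 - reduce(lambda acc, w: acc * (1 - w), degrees, 1.0)

    print(summary_harmfulness([0.95, 0.79, 0.87]))
    # 0.998635 (up to floating-point rounding)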

In yet another example, the use of several models for detecting of malicious files allows the summary degree of harmfulness to attain the predetermined threshold value (a calculated degree of harmfulness beyond this value meaning that a file is recognized as malicious) much sooner than when using each of the models for detecting of malicious files separately. For example, for the models for detecting of malicious files whose data are presented in Table #1, given that the calculated degrees of harmfulness vary monotonically, the number of the command from the behavior log after which the file will be recognized as malicious can be calculated by the formula

I_detect = ∏_{i=1}^{n} F(w_i, I_i) = 207,

where

-   -   I_detect is the number of the command from the behavior log after analysis of which the file will be recognized as malicious;
    -   I_i is the number of the command from the behavior log after analysis of which, using the model i, the file will be recognized as malicious;
    -   w_i is the degree of harmfulness calculated with the use of model i;
    -   n is the number of models for detection of malicious files used to calculate the number of the command from the behavior log after analysis of which the file will be recognized as malicious.

Thus, the obtained summary number of the command from the behavior log (207) is much lower than the earliest number of the command from the behavior log (288) after analysis of which the file was recognized as malicious by one of the models for detection of malicious files (model #2). That is, the use of several models for detection of malicious files may substantially increase the speed (i.e., the efficiency) of detection of malicious files.

In yet another example, the different detection models kept in the database of detection models 521 may be trained to detect malicious files with several, not necessarily unique, predetermined characteristic features; i.e., detection model #1 can detect files having a graphical user interface and exchanging data in a computer network, while model #2 can detect files exchanging data in a computer network and propagating in that computer network with the use of network vulnerabilities. Both of those detection models can detect the aforementioned malicious file “WannaCry.exe” thanks to the common characteristic trait of the file propagating in a computer network with the use of network vulnerabilities.

In yet another variant aspect of the system, one selects from the database of detection models 521 a model for detection of malicious files that was trained on files during whose execution there occurs at least:

-   -   the execution of the same commands as the commands selected from the behavior log of the file being executed 501; and/or
    -   the utilization of the same parameters as the parameters selected from the behavior log of the file being executed 501.

For example, from the behavior log there are selected the commands “CreateFileEx”, “ReadFile”, “WriteFile”, and “CloseHandle”, which are used for the modification of files, including the encrypting of files. From the database of detection models 521 there will be selected a detection model trained for use in detecting malicious files of the “Trojan-Cryptors” family.

In yet another example, from the behavior log there are selected the parameters “8080” and “21”, which describe commands working with a computer network (for example, connect, where the above described parameters are connection ports to an electronic address). From the database of detection models 521 there will be selected a detection model trained for use in detecting files exchanging data in a computer network.

In this aspect, the behavior log analysis module 530 is configured to:

-   -   form at least one behavior pattern on the basis of the commands and parameters selected from the behavior log of the file being executed 501, where the behavior pattern is a set of at least one command and a parameter which describes all the commands from that set;
    -   compute the convolution of all the behavior patterns formed;
    -   send the formed convolution to the harmfulness module 540 of the file being executed.

In one variant aspect of the system, the calculating of the convolution of the formed behavior patterns is based on a predetermined convolution function, such that the inverse convolution function of the result of that convolution function performed on all of the formed behavior patterns has a degree of similarity to that behavior pattern greater than a predetermined threshold value.

The formation and use of the convolution functions (calculation of the convolution) is described in more detail in FIG. 1 and FIG. 2.

In one aspect, the harmfulness module 540 is designed to:

-   -   calculate the degree of harmfulness of the file being executed 501 on the basis of an analysis of the obtained convolution with the aid of each obtained model of detection of malicious files;
    -   send each calculated degree of harmfulness to the analysis module 550.

In one variant aspect of the system, the decision making template is a composition of the degrees of harmfulness.

For example, the composition of the degrees of harmfulness calculated on the basis of the models #1, #2, #3 described above can be represented as an aggregate of pairs {0.95, 374}, {0.79, 288}, {0.87, 302}.

In yet another example, the composition of the degrees of harmfulness calculated on the basis of the models #1, #2, #3 described above can represent a measure of the central tendency of the calculated degrees of harmfulness (such as the arithmetic mean, in the given case 0.87).

In yet another example, the composition of the degrees of harmfulness represents the change in the degrees of harmfulness as a function of time or of the number of behavior patterns used to calculate the degree of harmfulness.

In one aspect, the analysis module 550 is designed to:

-   -   form a decision making template on the basis of the obtained degrees of harmfulness;
    -   recognize the file being executed 501 as malicious in the event that the degree of similarity between the formed decision making template and at least one of the predetermined decision making templates from a database of decision making templates 541 (previously formed on the basis of an analysis of malicious files) is greater than a predetermined threshold value; one possible similarity measure is sketched after this list.
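The disclosure does not fix a particular similarity measure between decision making templates; as a placeholder illustration only, cosine similarity over the vectors of degrees of harmfulness could look like this (all values below are invented):

    def cosine_similarity(a, b):
        # Placeholder similarity measure between two decision making
        # templates (vectors of degrees of harmfulness).
        dot = sum(x * y for x, y in zip(a, b))
        norm = lambda v: sum(x * x for x in v) ** 0.5
        return dot / (norm(a) * norm(b))

    observed = [0.95, 0.79, 0.87]   # degrees from the harmfulness module
    template = [0.93, 0.81, 0.90]   # stored template (hypothetical values)
    is_malicious = cosine_similarity(observed, template) > 0.99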

In one variant aspect of the system, the decision making template is an aggregate of the degrees of harmfulness obtained from the harmfulness module 540.

In yet another variant aspect of the system, the decision making template is the degree of harmfulness as a function of time or of the number of behavior patterns used to calculate that degree of harmfulness.

In yet another variant aspect of the system, the decision making templates from the database of decision making templates 541 are formed on the basis of an analysis of malicious files used for the training of models from the database of detection models 521.

For example, on the basis of 100,000 files, of which 75,000 are safe files and 25,000 are malicious files, the detection models are trained (including testing) and then saved in the database of detection models 521. After the models for detection of malicious files have been trained, they are used to form the decision making templates for some (or all) of the aforementioned 25,000 malicious files, which are then entered into the database of decision making templates 541. That is, a machine learning of the models for detection of malicious files is first carried out on a teaching and a testing sample of files. As a result, several models for detection of malicious files can be trained, each of which will be trained for the detecting of malicious files with unique predetermined characteristic traits. After all of the detection models have been trained, one determines which of the trained models for detecting of malicious files detect certain malicious files (in the above described example, the 25,000 malicious files). It may turn out that one malicious file can be detected with the use of one set of models for detection of malicious files, another one with the use of a second set, and a third one with the use of several models from the aforementioned sets. The decision making templates are formed on the basis of the data obtained as to which models for detection of malicious files are able to detect which malicious files.

In yet another variant aspect of the system, the analysis module 550 is additionally designed to retrain at least one detection model from the database of detection models 521 on the basis of commands and parameters selected from the behavior log of the file being executed 501, in the case when the degree of similarity between the formed decision making template and at least one of the predetermined decision making templates from the database of decision making templates 541 exceeds a predetermined threshold value, while the degrees of harmfulness calculated with the aid of those models for detection of a malicious file do not exceed a predetermined threshold value.

FIG. 6 shows a flow diagram of a method for detection of malicious files with the use of a trained model of detection of malicious files, in accordance with exemplary aspects of the disclosure.

The flow diagram of the method for detection of malicious files with the use of a trained model of detection of malicious files contains a step 610, in which the file being analyzed is executed, a step 620, in which a behavior log is formed, a step 630, in which behavior patterns are formed, a step 640, in which the convolution is calculated, a step 650, in which a detection model is selected, a step 660, in which the degree of harmfulness is calculated, a step 670, in which a decision making template is formed, a step 680, in which the file is recognized as malicious, and a step 690, in which the detection model is retrained.

In step 610, the behavior log forming module 112 is configured to at least:

-   -   execute the file being analyzed 501; and/or
    -   emulate the execution of the file being analyzed 501.

In step 620, the behavior log forming module 112 forms a behavior log for the file being analyzed 501, for which:

-   -   at least one command being executed is intercepted;
    -   for each intercepted command, at least one parameter describing that command is determined;
    -   the behavior log of that file 501 is formed on the basis of the intercepted commands and the parameters so determined.

In step 630, the behavior log analysis module 530 is used to form at least one behavior pattern on the basis of the commands and parameters selected from the behavior log of the file being executed 501. The behavior pattern is, in one aspect, a set of at least one command and a parameter which describes all the commands from that set.

In step 640, the behavior log analysis module 530 calculates the convolution of all the behavior patterns formed in step 630.

In step 650, the detection model selection module 520 selects from the database of detection models 521 at least two detection models for malicious files on the basis of the commands and parameters selected from the behavior log of the file being executed 501. The detection model of malicious files is, in one aspect, a decision making rule for determining the degree of harmfulness.

In step 660, the harmfulness module 540 calculates the degree of harmfulness of the file being executed 501 on the basis of an analysis of the convolution calculated in step 640, with the aid of each detection model for malicious files selected in step 650.

In step 670, the analysis module 550 forms a decision making template on the basis of the degrees of harmfulness obtained in step 660.

In step 680, the analysis module 550 recognizes the file being executed 501 as malicious in the event that the degree of similarity between the decision making template formed in step 670 and at least one of the predetermined decision making templates from the database of decision making templates 541 exceeds a predetermined threshold value.

In step 690, the analysis module 550 is used to retrain at least one detection model from the database of detection models 521 on the basis of the commands and parameters selected from the behavior log of the file being executed, in the event that the degree of similarity between the formed decision making template and at least one of the predetermined decision making templates from the database of decision making templates 541 exceeds a predetermined threshold value, while the degrees of harmfulness calculated with the aid of those detection models for a malicious file do not exceed a predetermined threshold value.

FIG. 7 shows an example of a system for detection of a malicious file,in accordance with exemplary aspects of the present disclosure.

A structural diagram of the system for detection of a malicious file consists of the file being analyzed 501, a behavior log forming module 112, a database of detection models 521, a data collection module 710, data about the behavior of the file 711, a parameter calculating module 720, a parameter calculation model 721, an analysis module 730, a criterion forming model 731, and a parameter correction module 740.

A more detailed description of the behavior log forming module 112, the file 501, the models database 521, and the analysis module 730 (as a variant aspect of the analysis module 550) is disclosed in FIG. 1, FIG. 2, FIG. 5 and FIG. 6.

The data collection module 710 is designed to form, based on data about the execution behavior 711 of the file 501 gathered by the behavior log forming module 112, a vector of features characterizing that behavior, where the vector of features is a convolution of the collected data 711 formed as an aggregate of numbers.

An example of the forming of the convolution of the collected data is presented in the description of the working of the behavior pattern forming module 121 in FIG. 1.

In one variant aspect of the system, the data on the execution behavior 711 of the file 501 includes at least:

-   -   the commands contained in the file being executed 501 or interpretable in the process of execution of the file 501, the attributes transmitted to those commands, and the values returned;
    -   data on the areas of RAM which can be modified during the execution of the file 501; and/or
    -   the static parameters of the file 501.

For example, the commands may be instructions (or groups of instructions) of the computer's processor, WinAPI functions, or functions from third-party dynamic libraries.

In yet another example, the file 501 may contain unprocessed (raw) data which is interpreted in the course of execution of the file 501 as processor commands (or commands of a certain process, in the case of “dll” libraries) and/or as parameters being transferred to the commands. In a particular case, such data can be portable code.

In yet another example, the data of RAM areas may be:

-   -   the convolutions of those memory areas (for example, with the use of fuzzy hashes);
    -   the results of lexical analysis of those memory areas, on the basis of which lexemes are extracted from the memory area and statistics are gathered on their use (for example, the frequency of use, the weighting characteristics, relations to other lexemes, and so on); and/or
    -   static parameters of those memory areas, such as size, owner (process), rights of use, and so forth.

The static parameters of the file 501 are parameters which characterize (identify) the file and which remain unchanged in the course of the execution, the analysis, or the modification of that file 501, or which characterize the file 501 up to the time of its execution.

In a particular instance, the static parameters of the file 501 may contain information about the characteristics of its execution or behavior (i.e., allowing a prediction of the result of the execution of the file 501).

In yet another example, the static parameters of the file are the size of the file, the time of its creation or modification, the owner of the file, the source from which the file was obtained (electronic or IP address), and so forth.

In yet another variant aspect of the system, data on the execution behavior 711 of the file 501 is gathered from various sources (input data channels), including at least:

-   -   the log of commands executed by the file being analyzed 501;
    -   the log of commands executed by the operating system or by applications being executed under the control of the operating system (except for the file being analyzed 501); and/or
    -   data obtained through the computer network.

In one aspect, the parameter calculating module 720 calculates, on the basis of the feature vector formed by the data collection module 710 and using the trained parameter calculation model 721, the degree of harmfulness and the limit degree of safety. In exemplary aspects, the degree of harmfulness is a numerical value characterizing the probability that the file 501 may prove to be malicious, and the limit degree of safety is a numerical value characterizing the probability that the file 501 will assuredly prove to be malicious when predetermined conditions are met. Depending on the degree of harmfulness and the limit degree of safety (see FIG. 9), the aggregate of said degrees calculated in succession is described by a predetermined time law.

In one variant aspect of the system, for each channel of input data (a source of input data, or data from a source of output data filtered by a predetermined criterion) there is created a system for extraction of features (a vector of real numbers of length N):

-   -   if the given channel involves the consecutive obtaining of information (for example, a log or a sequence of unpacked executable files), then a system is additionally created for aggregation of the features of the input sequence into a single vector;
    -   a system is created to transform the features from the given channel into a new vector of length K, the values in this vector only being allowed to increase monotonically as new elements of the input sequence are processed.

In yet another variant aspect of the system, the system for extraction, aggregation and transformation of features may depend on teaching parameters, which will be tuned later on in the step of teaching the entire model (a minimal sketch of the aggregation follows the list):

-   -   vectors of length K, arriving from all active channels, are monotonically aggregated into one vector of fixed length (for example, the maximum is taken element by element); and/or
    -   the aggregated monotonically increasing vector is transformed into one real number characterizing the suspiciousness of the process being investigated (for example, the vector is transformed by addition of its elements or by performing actions on the elements of the vector by a predetermined algorithm, such as calculating the norm of that vector).
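A minimal sketch of this aggregation, assuming two channels already transformed into vectors of length K = 3 (element-wise maximum, then the sum of elements as the collapsing norm):

    def aggregate(channel_vectors):
        # Element-wise maximum over all active channels; the result can
        # only grow monotonically as new input elements are processed.
        return [max(col) for col in zip(*channel_vectors)]

    def suspiciousness(vector):
        # Collapse the aggregated vector into one real number (L1 norm).
        return sum(vector)

    channels = [[0.1, 0.4, 0.0], [0.3, 0.2, 0.5]]
    print(suspiciousness(aggregate(channels)))  # 0.3 + 0.4 + 0.5 = 1.2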

In yet another variant aspect of the system, the parameter calculation model 721 has been previously trained by the method of machine learning on at least one safe file and at least one malicious file.

In yet another variant aspect of the system, the method of machine learning of the parameter calculation model 721 is at least:

-   -   decision tree-based gradient boosting;
    -   the decision tree method;
    -   the K-nearest neighbor (kNN) method; and/or
    -   the support vector machine (SVM) method.

In yet another variant aspect of the system, at least the calculated degree of harmfulness or the limit degree of safety depends on the degree of harmfulness and, respectively, the limit degree of safety calculated at the time of launching of the file 501 on the basis of an analysis of the static data of the file 501.

For example, the degree of harmfulness and the limit degree of safety may be calculated by the formulae:

ω = ω₀ + ω(t)
φ = φ₀ + φ(t)

where:

-   -   ω, φ are the degree of harmfulness and the limit degree of safety, respectively;
    -   ω₀, φ₀ are the starting values of the degree of harmfulness and the limit degree of safety, not depending on the execution parameters of the file 501 but depending on external conditions (the working parameters of the operating system and so forth);
    -   ω(t), φ(t) are the time laws used to calculate the degree of harmfulness and the limit degree of safety, respectively.

Said time laws may be dependent on each other, i.e., on the previously calculated degree of harmfulness and limit degree of safety:

ω(t_n) = ω(t, φ(t_{n−1}))
φ(t_n) = φ(t, ω(t_{n−1}))

The above variant aspect of the system is disclosed in more detail in FIG. 9.

In yet another variant aspect of the system, the trained parameter calculation model 721 is an aggregate of rules for calculating the degree of harmfulness of a file and the limit degree of safety of a file, dependent on the data determined about the execution behavior 711 of the file 501.

In yet another variant aspect of the system, the time laws describing the aggregate of consecutively calculated degrees of harmfulness and the aggregate of consecutively calculated limit degrees of safety are monotonic in nature.

For example, the curve of the change in the degree of harmfulness as a function of time may be described by a monotonically increasing function (such as f(x) = ax + b).

In yet another variant aspect of the system, the time laws describing the aggregate of consecutively calculated degrees of harmfulness and the aggregate of consecutively calculated limit degrees of safety have a piecewise monotonic nature, i.e., they are monotonic on specified time intervals.

Often during operation of the system being described, it is not possible (due to limitations on the computing resources, the computer time, the presence of demands on minimal performance, etc.) to determine the degree of harmfulness constantly (continuously or with a given periodicity). Therefore, the degree of harmfulness and the limit degree of safety may be calculated over calculable intervals of time (not predetermined ones, but intervals which can be calculated in the process of execution of the file 501). Such calculations are also based on certain predetermined time laws, for which the input parameters are calculated in the process of execution of the file; i.e., for the time of the next calculation one may write:

t_n = τ(t_{n−1})

The time of calculation of the degree of harmfulness and the limit degree of safety may depend on the previously calculated degree of harmfulness and limit degree of safety:

t_n = τ(t_{n−1}, ω(t_{n−1}), φ(t_{n−1}))

For example, when the file 501 is launched, for the first 10 seconds the degree of harmfulness of that file increases monotonically; after the 10th second, the degree of harmfulness of that file is halved, and then it begins to increase monotonically once again.
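The interplay of the two time laws and the calculable observation moments can be illustrated as follows; the concrete laws omega, phi and tau below are invented for the example and carry no meaning beyond it:

    def detect(omega, phi, tau, t_end=100.0):
        # Iterates t_n = tau(t_{n-1}, w, f), recalculating the degree of
        # harmfulness w and the limit degree of safety f at each moment.
        t, w, f = 0.0, 0.0, 1.0
        while t < t_end:
            w = omega(t, f)          # w(t_n) may depend on f(t_{n-1})
            f = phi(t, w)            # f(t_n) may depend on w(t_n)
            if w >= f:
                return "malicious", t
            t = tau(t, w, f)         # next calculable observation moment
        return "no verdict", t

    omega = lambda t, f: min(1.0, 0.05 * t)        # hypothetical law
    phi   = lambda t, w: max(0.5, 0.9 - 0.01 * t)  # hypothetical law
    tau   = lambda t, w, f: t + (0.5 if w > 0.5 * f else 2.0)
    print(detect(omega, phi, tau))   # detects around t = 15 with these laws

Note how tau shortens the interval between calculations once the degree of harmfulness approaches the limit degree of safety, mirroring the idea that suspicious episodes are analyzed more closely.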

The analysis module 730 pronounces a decision on the detection of a malicious file 501 in the event that the data collected on the execution behavior 711 of the file 501 meets a predetermined criterion for the finding of harmfulness. The criterion is formulated on the basis of the degree of harmfulness and the limit degree of safety as calculated by the parameter calculating module 720. In one aspect, the criterion is a rule for the classification of the file (provided by the criterion forming model 731) in terms of an established correlation between the degree of harmfulness and the limit degree of safety.

In one variant aspect of the system, the correlation between the degree of harmfulness and the limit degree of safety is at least:

-   -   the difference from a predetermined threshold value of the distance between the degree of harmfulness and the boundary conditions of harmfulness;
    -   the difference from a predetermined threshold value of the area bounded, in a given time interval, between the curves describing the degree of harmfulness and the limit degree of safety (approximated in the sketch following this list); and/or
    -   the difference from a predetermined threshold value of the rates of mutual growth of the curves describing the change in the degree of harmfulness and the boundary conditions of harmfulness as a function of time.
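For instance, the second correlation listed above (the area bounded between the two curves on a time interval) could be approximated as below; the sampled values and the threshold are invented for the illustration:

    def area_between(harmfulness, safety_limit, dt):
        # Rectangle-rule approximation of the area enclosed between the
        # limit-degree-of-safety curve and the degree-of-harmfulness curve.
        return sum((f - w) * dt for w, f in zip(harmfulness, safety_limit))

    w_curve = [0.2, 0.5, 0.7]    # sampled degree of harmfulness
    f_curve = [0.9, 0.8, 0.75]   # sampled limit degree of safety
    # The curves converging (a small remaining area) can serve as the
    # classification criterion for pronouncing the file malicious.
    print(area_between(w_curve, f_curve, dt=1.0) < 1.2)  # True: 1.05 < 1.2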

For example, the most characteristic instances of the described correlation are depicted in FIG. 9.

The parameter correction module 740 is designed to retrain the parameter calculation model 721 on the basis of an analysis (see FIG. 9) of the calculated degree of harmfulness and limit degree of safety. Once the model 721 is retrained, changes in the time laws describing the degree of harmfulness and the limit degree of safety cause the correlation between the values obtained with the use of those time laws to tend toward a maximum.

In one variant aspect of the system, the parameter calculation model 721 is retrained such that, when the model is used, the criterion formed afterwards ensures at least:

-   -   that the accuracy of determining the degree of harmfulness and the limit degree of safety is greater than when using an untrained model for calculation of parameters; and/or
    -   that the utilization of the computing resources is lower than when using an untrained model for calculation of parameters.

For example, after the retraining (or further training), the correlation factor between the values of the curves of the degree of harmfulness and the limit degree of safety becomes larger (tends toward 1).

As a result, under constant retraining of the parameter calculation model 721, the probability of occurrence of errors of the first and second kind (false positives and missed detections) constantly diminishes. The use of the different retraining criteria presented above ensures that the system for detection of a malicious file with a retrained model 721 has a very high rate of decrease in the errors of the first and second kind at the start (in the initial stages of the retraining), so that after only a few retraining iterations of the parameter calculation model 721 the effectiveness of the system for detection of a malicious file rises sharply, tending toward 100%.

FIG. 8 shows an example of a method for detection of a malicious file,in accordance with exemplary aspects of the present disclosure.

A structural diagram of the method for detection of a malicious file contains a step 810, in which a feature vector is formed, a step 820, in which parameters are calculated, a step 830, in which a decision is pronounced as to the detection of a malicious file, and a step 840, in which the parameter calculation model is retrained.

In step 810, a vector of the features characterizing the execution behavior 711 of the file 501 is formed on the basis of the data gathered about said behavior, the feature vector being a convolution of the gathered data in the form of an aggregate of numbers.

In step 820, there are calculated, on the basis of the feature vector so formed and using the trained parameter calculation model 721, the degree of harmfulness, which is a numerical value characterizing the probability that the file 501 may prove to be malicious, and the limit degree of safety, which is a numerical value characterizing the probability that the file 501 will assuredly prove to be malicious; the aggregate of said consecutively calculated degrees is described by a predetermined time law.

Steps 810-820 are carried out for different consecutive time intervals of execution of the file being analyzed 501, in accordance with exemplary aspects of the disclosure.

In step 830, a decision is pronounced as to the detection of a malicious file 501 in the event that the data gathered on the execution behavior 711 of the file 501 satisfies a predetermined criterion for a finding of harmfulness (see FIG. 9), formulated on the basis of the degree of harmfulness and the limit degree of safety as calculated in step 820, said criterion being a rule for classification of the file in terms of an established correlation between the degree of harmfulness and the limit degree of safety.

In step 840, the parameter calculation model 721 is additionally retrained on the basis of an analysis of the calculated degree of harmfulness and limit degree of safety, as a result of which changes in the time laws describing the degree of harmfulness and the limit degree of safety result in the correlation between the values obtained with those laws tending toward a maximum.

FIG. 9 shows examples of the dynamics of change in the degree of harmfulness and the limit degree of safety as a function of the number of behavior patterns.

In diagram 911, a situation is illustrated in which an increase in the degree of harmfulness of the file being analyzed 501 is observed over time (essentially, with an increasing number of behavior patterns formed). An increase is likewise observed in the limit degree of safety (the general case of the criterion of harmfulness shown in FIG. 3).

A decision as to the detection of a malicious file 501 is made if thedegree of the malicious file 501 begins to exceed the limit degree ofsafety of the file 501 (point 911.B).

Such a situation is observed in the event that “suspicious” activity isregistered both during the execution of the file 501 and upon analysisof the condition of the operating system as a whole. Thus, a decrease inthe probability of occurrence of an error of the first kind is achieved.Even though suspicious activity is registered in the working of thesystem (i.e., activity not yet able to be considered malicious, yet alsonot yet able to be considered safe, for example, archive packing withsubsequent deletion of the initial files), that activity is consideredwhen calculating the degree of harmfulness of the file 501, such thatthe pronouncing of a positive verdict as to the detection of a maliciousfile is not based for the most part on the suspicious activity in theworking of the system, rather than that during the execution of the file501, i.e., the contribution of the execution activity of the file 501 tothe final decision on recognizing the file 501 as malicious should begreater than the contribution of the system activity.

For example, a similar situation may be observed when a user performs an archiving of data on the computer, resulting in a repeated reading of data from the hard disk and subsequent renaming or deletion of files. For the system of an ordinary user (such as an office worker) this might be considered suspicious activity resembling the working of malicious encryption software, since such activity (based on statistical data obtained from users) is observed very seldom if at all for those users.

For example, a standard antivirus application during the analysis of the activity of software on a user's computer may issue warnings (not undertaking any active measures) that a particular application is behaving “suspiciously”, i.e., the behavior of that application does not conform to predetermined rules of the antivirus application. But the proposed system does not operate by predetermined rules, and instead dynamically assesses the change in activity, resulting in the detection (pronouncing as malicious) of a malicious, but unknown file 501.

In yet another example, the change in activity upon execution of a file 501 may be a consequence of the transmission of data in a computer network, depending on the characteristics of the data being transmitted (frequency, quantity, and so forth), which may indicate that malicious activity is taking place (for example, a malicious program of remote administration (backdoor) is running). The longer such activity goes on, the higher the chance of it being recognized as malicious, since it differs noticeably from typical network activity on the user's computer.

In diagram 912 a situation is depicted in which an increase in the degree of harmfulness of the file being analyzed 501 and a decrease in the limit degree of safety are observed over time.

The decision as to the detection of a malicious file 501 is made if the degree of harmfulness of the file 501 begins to exceed the limit degree of safety of the file 501 (point 912.B).

Such a situation is observed in the event, which is the converse of that described in diagram 911, that no “suspicious” activity is observed during the analysis of the condition of the operating system. Thus, a decrease is achieved in the probability of occurrence of an error of the second kind (overlooking a malicious file). Suspicious activity influences the pronouncing of an affirmative verdict as to the detection of a malicious file “more strongly” if the rest of the behavior, during the execution of the file in particular or of the operating system as a whole, does not look “suspicious”.

For example, such a situation may be observed during operation of malicious programs of remote administration on the user's computer. The malicious activity appears only episodically, so every subsequent episode may be analyzed more “closely”. In other words, the criterion beyond which the activity will be considered malicious should decrease constantly. But in the event that trusted applications begin being executed on the user's computer, whose behavior could be considered suspicious, yet is not considered such on account of the applications being trusted (i.e., previously checked for harmfulness), the limit degree of harmfulness may be increased. This will protect against recognizing the behavior of legitimate files as malicious and will merely postpone the detecting of malicious behavior of a malicious file.

Diagram 913 depicts a situation in which it is observed that the degree of harmfulness of the analyzed file 501 increases over time. The increase does not start from the zero mark, but rather from a certain calculated value, so that the criterion of harmfulness will be reached sooner than in the initial case, or it will be reached whereas it would not have been reached in the initial case.

The decision on the detection of a malicious file 501 is pronounced if the difference between the degree of harmfulness of the file 501 and the limit degree of safety of the file 501 becomes less than a predetermined threshold value (point 913.B). In a particular instance, such a decision can be made only if the difference between the degree of harmfulness of the file 501 and the limit degree of safety of the file 501 had earlier become less than another predetermined threshold value (point 913.A) (and the difference between points 913.A and 913.B may have increased).

For example, during the execution of a file 501 obtained from an unknown source or formed on the computer by “suspicious” methods (such as a writing of data from the memory of a process to disk), the degree of harmfulness may initially reveal itself to be higher than the degree of harmfulness of files obtained by less “suspicious” methods.

In diagram 914 a situation is illustrated which is analogous to the situation depicted in diagram 911, with the only difference being that the curves describing the degree of harmfulness and the limit degree of safety have several successive points of intersection. In such a situation, the decision to recognize the file 501 as malicious will be made not by the mere fact of the intersecting of these curves, but by the number of intersections exceeding a predetermined threshold value or by the area bounded by these curves exceeding a predetermined threshold value.

The approaches illustrated in these diagrams increase the effectiveness of detection of malicious files and reduce the errors of the first and second kind in the detecting of malicious files 501.

The description of the correlation between the calculated degree of harmfulness and the calculated limit degree of safety, and the decision on pronouncing the file 501 as malicious, can be expressed in the following mathematical or algorithmic form:

$$\omega(t) > \varphi(t), \qquad \sum\limits_{n}\left( \omega\left( t_{n} \right) - \varphi\left( t_{n} \right) \right)^{2} > \varepsilon,$$

where ω(t) is the degree of harmfulness of the file 501 at time t, φ(t) is the limit degree of safety at time t, and ε is a predetermined threshold value.
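A minimal sketch of this criterion in Python, assuming ω and φ are given as equally sampled sequences; the value of ε and the reading of the second condition as an accumulated squared gap are illustrative assumptions.

```python
from typing import Sequence

def harmfulness_criterion(harm: Sequence[float],
                          safety: Sequence[float],
                          eps: float = 0.5) -> bool:
    # First condition: omega(t) > phi(t) at some observed instant
    # (points 911.B and 912.B in FIG. 9).
    crossed = any(w > p for w, p in zip(harm, safety))
    # Second condition: the accumulated squared gap between the two curves
    # exceeds epsilon (the area-style criterion of diagram 914).
    gap = sum((w - p) ** 2 for w, p in zip(harm, safety))
    return crossed or gap > eps
```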

FIG. 10 shows a structural diagram of a system for classification of objects of a computer system, according to exemplary aspects of the disclosure.

The structural diagram of the system for classification of objects of a computer system consists of an object of the computer system 1001, a gathering module 1010, data about the object of the computer system 1011, a convolution forming module 1020, a feature vector 1021, a degree of similarity calculating module 1030, a model for calculation of parameters 1031, an analysis module 1040, a parameter correction module 1050 and a model for formation of criteria 1051.

The gathering module 1010 is designed to gather data 1011 describing the object of the computer system 1001 (hereinafter, the object).

In one variant aspect of the system, the computer systems are at least:

-   -   personal computers,
    -   notebooks,
    -   tablets,
    -   smartphones,
    -   controllers,
    -   servers,
    -   data storage means.

In yet another variant aspect of the system, the objects of the computer system 1001 are at least:

-   -   files,
    -   processes,
    -   threads,
    -   synchronization objects,
    -   applications,
    -   archives (files containing other files),
    -   database records.

In yet another variant aspect of the system, the data describing the object 1001 is at least:

-   -   data identifying the object 1001 (such as a file name or hash computed from the file),
    -   data describing the logical and functional relations between that object 1001 and other objects 1001 (for example, which files are contained in an archive, which threads are generated in relation to it, and so forth),
    -   data describing the difference of that object 1001 from other objects 1001 (such as file size, type of executable file, method of using the object, and so forth),
    -   the type of the object 1001.

The data describing the object 1001 (or parameters characterizing the object 1001) are described in further detail in FIG. 1, FIG. 2, and FIG. 5 to FIG. 8.

For example, the computer system may be the personal computer of a user, and the objects of the computer system 1001 may be files. The classification of the files of that personal computer consists in determining which files are malicious, and which files are safe (an antivirus scan is performed).

In yet another variant aspect, the data describing the object 1001 is gathered and analyzed in accordance with conditions established by specified rules of gathering. As a result, before the analysis is performed by the analysis module 1040, data may be gathered on several (often differing) states of the object 1001. This, in turn, results in increased accuracy of the classification of the object 1001 and fewer errors of the first and second kind, which may arise during said classification.

In yet another variant aspect of the system, the analysis of the data on the object 1001 may be done in parallel with the gathering of the data on the object 1001, and thus two analysis results may be based on common data to a certain degree. This, in turn, results in increased accuracy of the classification of the object 1001 and fewer errors of the first and second kind, which may arise during said classification, and also increased speed of performance of the classification and less use of computer resources during such a classification.

In yet another variant aspect of the system, the state of the object 1001 is a set of parameters and attributes (which can be identified from the gathered data characterizing the object 1001) at least:

-   -   clearly identifying the object 1001 among other objects,
    -   clearly identifying a group of objects 1001 having identical or similar parameters or attributes;
    -   distinguishing the object 1001 from other objects with a given degree of similarity.

In yet another variant aspect of the system, the gathering rule is at least:

-   -   the interval of time between different states of the object 1001 satisfies a given value;
    -   the interval of change in a parameter describing the state of the object 1001 satisfies a given value;
    -   the interval of change in a parameter of the computer system resulting in a change in state of the object 1001 satisfies a given value.

Instead of a change in state of the system in time (as stated in the first item above), it is possible to use a change in state of the system allowing for the dynamics of change of a second selected parameter of the object 1001 or the computer system. In this case, the object 1001 is analyzed in a multidimensional space, where the specified parameters are independent quantities (the bases of that space), and time is a quantity dependent on those parameters.

For example, the states of the objects 1001 are determined every 100 ms (i.e., the interval of time between two states of the object 1001 is 100 ms).

In yet another example, the states of the objects 1001 are determined after a change occurs in the size of those objects (or the volume of data contained in the object 1001) by 1 kB (i.e., the parameter describing the state of the object 1001 changes).

In yet another example, the states of the objects 1001 are determined after a change occurs in the volume of memory or data storage means used (for example, a hard disk) by 1 MB (i.e., a parameter of the computer system changes).
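The three gathering-rule variants and the 100 ms / 1 kB / 1 MB intervals of these examples can be sketched as follows; the class and method names are hypothetical stand-ins, not identifiers from the disclosure.

```python
class GatheringRule:
    """Signals that a new state of the object 1001 should be recorded when
    any configured interval from the gathering rule is satisfied."""

    def __init__(self, dt_ms: int = 100, d_obj_bytes: int = 1024,
                 d_sys_bytes: int = 1 << 20):
        self.dt_ms = dt_ms            # time between states (100 ms)
        self.d_obj = d_obj_bytes      # change in object size (1 kB)
        self.d_sys = d_sys_bytes      # change in a system parameter (1 MB)
        self._last = None

    def should_record(self, now_ms: int, obj_size: int, sys_used: int) -> bool:
        if self._last is None:
            self._last = (now_ms, obj_size, sys_used)
            return True
        t0, s0, m0 = self._last
        hit = (now_ms - t0 >= self.dt_ms
               or abs(obj_size - s0) >= self.d_obj
               or abs(sys_used - m0) >= self.d_sys)
        if hit:
            self._last = (now_ms, obj_size, sys_used)
        return hit
```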

In yet another variant aspect of the system, the gathering of data 1011 about the object 1001 is done by intercepting data 1011 (about the object 1001, or data being transmitted to the object 1001 or from the object 1001) on the computer device with the aid of a driver or other software embedded in the computer device.

For example, in order to obtain data about files 1001 on a personal computer, a driver is used which intercepts calls for WinAPI functions from applications for working with those files 1001.

The convolution forming module 1020 forms, on the basis of the data about the state of the object 1001 gathered by the gathering module 1010, a feature vector 1021 characterizing the state of the object 1001.

In one variant aspect of the system, the feature vector 1021 represents a convolution of collected data 1011 organized in the form of a set of numbers.

The formation of the feature vector 1021 is described in further detail in FIG. 1 and FIG. 2.

In yet another variant aspect of the system, the feature vector 1021 contains at least one hash sum, calculated at least from the gathered data 1011:

-   -   of given type (for example, the hash sum is calculated only from data characterizing events related to the object 1001);
    -   of given value range (for example, the hash sum is calculated only from files with a size between 4096 kB and 10240 kB).

For example, all of the data gathered about an object 1001 which is an installation package containing other files (which will be on the computer of the user) and installation instructions (scripts, etc.) can be divided into two categories: executable files and scripts, and auxiliary data. For the executable files, the hash sum MD5 is calculated; for the scripts, CRC32; and the number of those objects in each class is also counted. The summary hash sum is an aggregate of the computed hash sums and the counted number of objects.
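A sketch of such a summary hash sum, assuming the package contents have already been split into categories; the dictionary layout is an assumption of this example.

```python
import hashlib
import zlib

def summary_hash(package: dict) -> dict:
    # `package` maps a category name to a list of file contents (bytes),
    # e.g. {"executables": [...], "scripts": [...], "auxiliary": [...]}.
    md5_sums = [hashlib.md5(data).hexdigest()
                for data in package.get("executables", [])]
    crc_sums = [format(zlib.crc32(data) & 0xFFFFFFFF, "08x")
                for data in package.get("scripts", [])]
    # The aggregate: the computed hash sums plus the object count per class.
    return {"md5": md5_sums,
            "crc32": crc_sums,
            "counts": {name: len(items) for name, items in package.items()}}
```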

In yet another variant aspect of the system, the model for calculation of parameters 1022 was previously trained by the method of machine learning on at least two objects 1001 belonging to different classes.

In yet another variant aspect of the system, the method of machine learning of the model for calculation of parameters 1022 is at least:

-   -   decision tree-based gradient boosting;
    -   the decision tree method;
    -   the K-nearest neighbor (kNN) method;
    -   the support vector machine (SVM) method.

In yet another variant aspect of the system, the trained model for calculation of parameters 1022 is an aggregate of rules for calculating the degree of similarity of the object 1001 and the limit degree of difference of the object 1001, depending on the data determined for the dynamics of change in the state of the object 1001.
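As an illustration of the first method in the list above (decision tree-based gradient boosting), the following minimal sketch assumes scikit-learn is available and uses synthetic data; the actual training corpus and feature set of model 1022 are not specified at this level of the disclosure.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
X = rng.random((200, 16))                    # 200 objects, 16 features each
y = (X[:, 0] + X[:, 1] > 1.0).astype(int)    # synthetic labels: 0 = safe, 1 = malicious

model = GradientBoostingClassifier().fit(X, y)
# Degree of similarity to the "malicious" class for a new feature vector 1021:
degree_of_similarity = model.predict_proba(X[:1])[0, 1]
```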

In yet another variant aspect of the system, the classes of the objects of the computer system 1001 are at least the following classes:

-   -   safety of the objects of the computer system:
        -   malicious object of the computer system;
        -   suspicious object of the computer system;
        -   safe object of the computer system;
    -   priority of use of objects of the computer system (i.e., which object of the computer system is to be used earlier, and how much earlier, or which computing resources, such as memory, are to be allocated to which object of the computer system);
    -   performance of the objects of the computer system.

For example, when analyzing the files 1001 on the personal computer of a user, an antivirus scan is performed, the purpose of which is a classification of all files 1001 being analyzed into two groups: malicious files and safe files. Each file in these classes can be matched up with a certain degree of similarity (i.e., the probability that the file 1001 should belong to one of the stated classes). Such an example is described more closely in FIG. 1 to FIG. 9.

In yet another example, the classes might not be various separate entities (as in the example given above), but a single entity in different ranges. For instance, the priority of allocation of computing resources (RAM) to the objects of the computer system can be assessed numerically from 0% to 100% and can form 4 classes: 1) from 0% to 25%; 2) from 26% to 50%; 3) from 51% to 75%; 4) from 76% to 100%. In the given example, RAM from a pool of 1 GB is allocated among the objects of the computer system: objects with minimal priority (0%) are allocated 1 MB of RAM, objects with maximum priority (100%) are allocated 100 MB, and the other objects are allocated corresponding proportions. A sketch of this mapping is given below.
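In this sketch, linear interpolation between the 1 MB floor and the 100 MB ceiling is an assumption; the example above only fixes the endpoints and the four class ranges.

```python
def priority_class(priority_pct: float) -> int:
    # Classes 1-4 cover the ranges 0-25 %, 26-50 %, 51-75 %, 76-100 %.
    for cls, upper in enumerate((25, 50, 75, 100), start=1):
        if priority_pct <= upper:
            return cls
    raise ValueError("priority must lie in [0, 100]")

def ram_allocation_mb(priority_pct: float) -> float:
    # 0 % priority -> 1 MB, 100 % priority -> 100 MB, proportional between.
    return 1.0 + (100.0 - 1.0) * priority_pct / 100.0
```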

The degree of similarity calculating module 1030 calculates, on the basis of the feature vector 1021 formed by the convolution forming module 1020 and using a trained model for calculation of parameters 1022, the degree of similarity, representing a numerical value characterizing the probability that the object 1001 being classified may belong to a given class, and the limit degree of difference, representing a numerical value characterizing the probability that the object 1001 being classified will certainly belong to another specified class. This degree of similarity and this limit degree of difference are independent characteristics describing the object 1001, stemming from different approaches to the classification of objects. The advantage of such an approach is that each method of classification (or method of comparison) has its own accuracy, and there always exists a probability of occurrence of errors of the first and second kind. When several independent methods are used, that probability is reduced in accordance with the laws of probability theory. Depending on the methods chosen (on how strongly the degrees of similarity or difference obtained by using them are correlated with each other), the combined probability of occurrence of errors of the first and second kind will change (decrease). Thus, knowing the criteria of “stability” of the system, i.e., knowing the maximum level of errors acceptable for the working of the system (in the present case, for the classification), one can select corresponding methods for obtaining the degrees of similarity or difference.

In one variant aspect of the system, if in the course of the period defined by the specified rule of collection at least two degrees of similarity and limit degrees of difference have been calculated, the aggregate of consecutively calculated degrees of similarity and limit degrees of difference is described by a predetermined time law.

In yet another variant aspect of the system, several degrees of similarity and limit degrees of difference are calculated for one object 1001, on the basis of data on at least two states of that object 1001.

In yet another variant aspect of the system, the data on the state of the object 1001 includes at least:

-   -   the actions being executed on the object 1001 by the computer system;
    -   the actions being executed by the object 1001 on the computer system;
    -   the parameters of the computer system whose change results in a change in the state of the object 1001;
    -   static parameters of the object 1001 (parameters of the object 1001 not changed upon a change in the state of the object 1001, such as the size of a file kept in an archive or the name of an executable file).

For example, if the object 1001 is an executable file, the commands being executed by that executable file on the operating system may be calls for WinAPI functions.

In yet another example, if the object 1001 is a record in a database, the commands executed by the means of working with databases on that record may be SQL query commands.

In yet another variant aspect of the system, the degree of similarity or the limit degree of difference being calculated depends on the degree of similarity or, accordingly, the limit degree of difference calculated at least:

-   -   at the instant of creating the object 1001;
    -   at the instant of the first change in state of the object 1001;
    -   on the basis of an analysis of the static parameters of the object 1001.

For example, if at the start of the execution of a file the degree of similarity to the class of malicious objects for the file is 0.0, but as time passes it rises to 0.4, the degree of similarity to a malicious object for a file created by that file is designated as 0.4 already at the instant of its creation, and it increases in the process of its working. This process is described in further detail in FIG. 7 to FIG. 9.

In yet another variant aspect of the system, the time laws describing the aggregate of consecutively calculated degrees of similarity and the aggregate of the consecutively calculated limit degrees of difference are monotonic in nature.

For example, the change in the degree of similarity (or the degree of harmfulness, in an analysis of a file for harmfulness) of the file 1001 being analyzed can only increase, while the limit degree of difference (the limit degree of safety in an analysis of a file for harmfulness) can only decrease. Thus, sooner or later the analyzed file will be recognized as malicious, once the sum total of its “suspicious actions” exceeds the established limit.
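One way to realize such monotonic laws over a series of raw model outputs is a running maximum for the degree of similarity and a running minimum for the limit degree of difference; this is a sketch of one possible realization, not the method fixed by the disclosure.

```python
from typing import Iterable, List, Tuple

def monotonic_degrees(pairs: Iterable[Tuple[float, float]]) -> List[Tuple[float, float]]:
    # Each input pair is (degree_of_similarity, limit_degree_of_difference);
    # the first component may only grow, the second may only shrink.
    out: List[Tuple[float, float]] = []
    hi, lo = float("-inf"), float("inf")
    for sim, lim in pairs:
        hi, lo = max(hi, sim), min(lo, lim)
        out.append((hi, lo))
    return out
```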

In yet another variant aspect of the system, the degree of similarity is determined at least:

-   -   using the Hirschberg algorithm;
    -   by the Damerau-Levenshtein distance;
    -   by the Jensen-Shannon distance;
    -   by the Hamming distance;
    -   using the Jaro-Winkler similarity algorithm.

For example, the above indicated methods of determining the degree of similarity may be used depending on which objects 1001 are being analyzed. If the objects 1001 are text files, one will use the Hirschberg algorithm; if they are lexemes, the Hamming distance.
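For the lexeme case, a Hamming-distance-based degree of similarity can be as simple as the fraction of matching positions; this normalization is an illustrative assumption, since the disclosure does not fix one.

```python
def hamming_similarity(a: str, b: str) -> float:
    # The Hamming distance is defined only for sequences of equal length.
    if len(a) != len(b):
        raise ValueError("lexemes must have equal length")
    if not a:
        return 1.0
    matches = sum(x == y for x, y in zip(a, b))
    return matches / len(a)
```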

The analysis module 1040 is designed to pronounce a decision as to whether the object 1001 belongs to a given class, in the event that the data on the state of the object 1001 gathered up to the time of actuation of the given gathering rule satisfies the given criterion for determining the class. The criterion is formulated on the basis of the degree of similarity and the limit degree of difference as calculated by the degree of similarity calculating module 1030. The criterion is the rule for classification of the object by the correlation established between the degree of similarity and the limit degree of difference.

In one variant aspect of the system, the analysis module 1040 begins working after data characterizing the object 1001 has been gathered and processed with the aid of the gathering module 1010, the convolution forming module 1020 and the degree of similarity calculating module 1030. This fact is determined with the aid of the data gathering rule (i.e., the rule of when to halt the gathering of data on the object 1001 and commence the analysis of that data).

The analysis is described in further detail in FIG. 7 to FIG. 9.

In one variant aspect of the system, the correlation between the degree of similarity and the limit degree of difference is at least:

-   -   the difference from a predetermined threshold value of the distance between the degree of similarity and the limit degree of difference;
    -   the difference from a predetermined threshold value of the area bounded in a given time interval between the degree of similarity and the limit degree of difference;
    -   the difference from a predetermined threshold value of the rate of mutual growth of the curves describing the change in the degree of harmfulness and the limit degree of difference.

The correlations are described in further detail in FIG. 7 to FIG. 9.
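A sketch of the three correlation variants just listed, over sampled curves; the threshold values and the rectangle-rule area computation are placeholders of this example.

```python
from typing import Sequence

def correlation_criterion(sim: Sequence[float], lim: Sequence[float],
                          t: Sequence[float], d_thr: float = 0.0,
                          area_thr: float = 1.0, rate_thr: float = 0.05) -> bool:
    # Variant 1: the distance between the curves reaches the threshold.
    distance_hit = any(l - s <= d_thr for s, l in zip(sim, lim))
    # Variant 2: area bounded between the curves over the given interval.
    area = sum((sim[i] - lim[i]) * (t[i] - t[i - 1]) for i in range(1, len(t)))
    # Variant 3: difference in the rates of growth of the two curves.
    rate = ((sim[-1] - sim[0]) - (lim[-1] - lim[0])) / (t[-1] - t[0])
    return distance_hit or area > area_thr or rate > rate_thr
```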

The parameter correction module 1050 is designed to retrain the model for calculation of parameters 1022 on the basis of an analysis of the calculated degree of similarity and the calculated limit degree of difference, as a result of which changes in the time laws describing the degree of similarity and the limit degree of difference will result in the correlation between the values obtained on the basis of those laws tending toward a maximum.

In one variant aspect of the system, the model for calculation of parameters 1022 is retrained so that, when that model 1022 is used, a criterion formed afterwards will ensure at least:

-   -   that the accuracy of determining the degree of similarity and the limit degree of difference is greater than when using an untrained model for calculation of parameters 1022;
    -   that the utilization of the computing resources is lower than when using an untrained model for calculation of parameters.

The technology for the machine learning is described in further detail in FIG. 1, FIG. 2, FIG. 5, and FIG. 6. Even though the training of the model for calculation of parameters 1022 has been described above for the classification of objects of a computer system 1001, while the figures show models for detection of malicious files, these technologies are algorithmically similar, and the detection of malicious files is a particular instance of the use of the model for calculation of parameters, since in this case there is a classification of files into two classes: “safe files” and “malicious files”.

FIG. 11 shows a structural diagram of a method for classification of objects of a computer system.

The structural diagram of the method for classification of objects of a computer system contains a step 1110 in which data is gathered about an object of the computer system, a step 1120 in which a feature vector is formed, a step 1130 in which degrees of similarity are calculated, a step 1140 in which the object of the computer system is classified, and a step 1150 in which a model for calculation of parameters is retrained.

In step 1110, data 1011 describing the state of the object of the computer system 1001 (hereafter, the object) is gathered.

In step 1120, on the basis of the data 1011 gathered about the states of the object 1001, a feature vector 1021 is formed, characterizing the state of the object 1001.

In step 1130, on the basis of the feature vector 1021 formed and using a trained model for calculation of parameters 1022, there is calculated a degree of similarity, representing a numerical value characterizing the probability that the object 1001 being classified may belong to a given class, and a limit degree of difference, representing a numerical value characterizing the probability that the object 1001 being classified will certainly belong to another specified class.

In step 1140, a decision is pronounced that the object 1001 belongs to the specified class if the data 1011 on the state of the object 1001 that was collected during a period of time as defined by a specified rule for the collection in steps 1110-1130 satisfies a specified criterion for determination of the class, formed on the basis of the degree of similarity and the limit degree of difference calculated in the previous step, said criterion being a rule for the classification of the object 1001 according to an established correlation between the degree of similarity and the limit degree of difference.

In step 1150, the model for calculation of parameters 1022 is retrained on the basis of the analysis of the calculated degree of similarity and the calculated limit degree of difference, as a result of which changes in the time laws describing the degree of similarity and the limit degree of difference will cause the correlation between the values obtained on the basis of those laws to tend toward a maximum.
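Steps 1110-1150 compose naturally into one driver. In the following sketch every callable is a hypothetical stand-in for the corresponding step; none of the names come from the disclosure.

```python
from typing import Callable, List, Sequence, Tuple

def classify_object(states: Sequence[object],
                    featurize: Callable[[object], List[float]],            # step 1120
                    model: Callable[[List[float]], Tuple[float, float]],   # step 1130
                    criterion: Callable[[List[Tuple[float, float]]], bool],  # step 1140
                    retrain: Callable[[List[Tuple[float, float]]], None]   # step 1150
                    ) -> bool:
    # Step 1110 is assumed done: `states` already holds the gathered data 1011.
    history = [model(featurize(s)) for s in states]
    verdict = criterion(history)
    retrain(history)
    return verdict
```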

FIG. 12 is a block diagram illustrating a computer system 20 on which aspects of systems and methods of classification of objects of a computer system may be implemented in accordance with an exemplary aspect. It should be noted that the computer system 20 can correspond to any components of the system 100 described earlier. The computer system 20 can be in the form of multiple computing devices, or in the form of a single computing device, for example, a desktop computer, a notebook computer, a laptop computer, a mobile computing device, a smart phone, a tablet computer, a server, a mainframe, an embedded device, and other forms of computing devices.

As shown, the computer system 20 includes a central processing unit (CPU) 21, a system memory 22, and a system bus 23 connecting the various system components, including the memory associated with the central processing unit 21. The system bus 23 may comprise a bus memory or bus memory controller, a peripheral bus, and a local bus that is able to interact with any other bus architecture. Examples of the buses may include PCI, ISA, PCI-Express, HyperTransport™, InfiniBand™, Serial ATA, I²C, and other suitable interconnects. The central processing unit 21 (also referred to as a processor) can include a single or multiple sets of processors having single or multiple cores. The processor 21 may execute one or more computer-executable codes implementing the techniques of the present disclosure. The system memory 22 may be any memory for storing data used herein and/or computer programs that are executable by the processor 21. The system memory 22 may include volatile memory such as a random access memory (RAM) 25 and non-volatile memory such as a read only memory (ROM) 24, flash memory, etc., or any combination thereof. The basic input/output system (BIOS) 26 may store the basic procedures for transfer of information between elements of the computer system 20, such as those at the time of loading the operating system with the use of the ROM 24.

The computer system 20 may include one or more storage devices such as one or more removable storage devices 27, one or more non-removable storage devices 28, or a combination thereof. The one or more removable storage devices 27 and non-removable storage devices 28 are connected to the system bus 23 via a storage interface 32. In an aspect, the storage devices and the corresponding computer-readable storage media are power-independent modules for the storage of computer instructions, data structures, program modules, and other data of the computer system 20. The system memory 22, removable storage devices 27, and non-removable storage devices 28 may use a variety of computer-readable storage media. Examples of computer-readable storage media include machine memory such as cache, SRAM, DRAM, zero capacitor RAM, twin transistor RAM, eDRAM, EDO RAM, DDR RAM, EEPROM, NRAM, RRAM, SONOS, PRAM; flash memory or other memory technology such as in solid state drives (SSDs) or flash drives; magnetic cassettes, magnetic tape, and magnetic disk storage such as in hard disk drives or floppy disks; optical storage such as in compact disks (CD-ROM) or digital versatile disks (DVDs); and any other medium which may be used to store the desired data and which can be accessed by the computer system 20.

The system memory 22, removable storage devices 27, and non-removable storage devices 28 of the computer system 20 may be used to store an operating system 35, additional program applications 37, other program modules 38, and program data 39. The computer system 20 may include a peripheral interface 46 for communicating data from input devices 40, such as a keyboard, mouse, stylus, game controller, voice input device, touch input device, or other peripheral devices, such as a printer or scanner via one or more I/O ports, such as a serial port, a parallel port, a universal serial bus (USB), or other peripheral interface. A display device 47 such as one or more monitors, projectors, or integrated display, may also be connected to the system bus 23 across an output interface 48, such as a video adapter. In addition to the display devices 47, the computer system 20 may be equipped with other peripheral output devices (not shown), such as loudspeakers and other audiovisual devices.

The computer system 20 may operate in a network environment, using a network connection to one or more remote computers 49. The remote computer (or computers) 49 may be local computer workstations or servers comprising most or all of the aforementioned elements in describing the nature of a computer system 20. Other devices may also be present in the computer network, such as, but not limited to, routers, network stations, peer devices or other network nodes. The computer system 20 may include one or more network interfaces 51 or network adapters for communicating with the remote computers 49 via one or more networks such as a local-area computer network (LAN) 50, a wide-area computer network (WAN), an intranet, and the Internet. Examples of the network interface 51 may include an Ethernet interface, a Frame Relay interface, SONET interface, and wireless interfaces.

Aspects of the present disclosure may be a system, a method, and/or a computer program product. The computer program product may include a computer readable storage medium (or media) having computer readable program instructions thereon for causing a processor to carry out aspects of the present disclosure.

The computer readable storage medium can be a tangible device that can retain and store program code in the form of instructions or data structures that can be accessed by a processor of a computing device, such as the computer system 20. The computer readable storage medium may be an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination thereof. By way of example, such computer-readable storage medium can comprise a random access memory (RAM), a read-only memory (ROM), EEPROM, a portable compact disc read-only memory (CD-ROM), a digital versatile disk (DVD), flash memory, a hard disk, a portable computer diskette, a memory stick, a floppy disk, or even a mechanically encoded device such as punch-cards or raised structures in a groove having instructions recorded thereon. As used herein, a computer readable storage medium is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or transmission media, or electrical signals transmitted through a wire.

Computer readable program instructions described herein can be downloaded to respective computing devices from a computer readable storage medium or to an external computer or external storage device via a network, for example, the Internet, a local area network, a wide area network and/or a wireless network. The network may comprise copper transmission cables, optical transmission fibers, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. A network interface in each computing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium within the respective computing device.

Computer readable program instructions for carrying out operations of the present disclosure may be assembly instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine dependent instructions, microcode, firmware instructions, state-setting data, or either source code or object code written in any combination of one or more programming languages, including an object oriented programming language, and conventional procedural programming languages. The computer readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a LAN or WAN, or the connection may be made to an external computer (for example, through the Internet). In some aspects, electronic circuitry including, for example, programmable logic circuitry, field-programmable gate arrays (FPGA), or programmable logic arrays (PLA) may execute the computer readable program instructions by utilizing state information of the computer readable program instructions to personalize the electronic circuitry, in order to perform aspects of the present disclosure.

In various aspects, the systems and methods described in the present disclosure can be addressed in terms of modules. The term “module” as used herein refers to a real-world device, component, or arrangement of components implemented using hardware, such as by an application specific integrated circuit (ASIC) or FPGA, for example, or as a combination of hardware and software, such as by a microprocessor system and a set of instructions to implement the module's functionality, which (while being executed) transform the microprocessor system into a special-purpose device. A module may also be implemented as a combination of the two, with certain functions facilitated by hardware alone, and other functions facilitated by a combination of hardware and software. In certain implementations, at least a portion, and in some cases, all, of a module may be executed on the processor of a computer system (such as the one described in greater detail in FIG. 12, above). Accordingly, each module may be realized in a variety of suitable configurations, and should not be limited to any particular implementation exemplified herein.

In the interest of clarity, not all of the routine features of the aspects are disclosed herein. It would be appreciated that in the development of any actual implementation of the present disclosure, numerous implementation-specific decisions must be made in order to achieve the developer's specific goals, and these specific goals will vary for different implementations and different developers. It is understood that such a development effort might be complex and time-consuming, but would nevertheless be a routine undertaking of engineering for those of ordinary skill in the art, having the benefit of this disclosure.

Furthermore, it is to be understood that the phraseology or terminology used herein is for the purpose of description and not of restriction, such that the terminology or phraseology of the present specification is to be interpreted by those skilled in the art in light of the teachings and guidance presented herein, in combination with the knowledge of those skilled in the relevant art(s). Moreover, it is not intended for any term in the specification or claims to be ascribed an uncommon or special meaning unless explicitly set forth as such.

The various aspects disclosed herein encompass present and future known equivalents to the known modules referred to herein by way of illustration. Moreover, while aspects and applications have been shown and described, it would be apparent to those skilled in the art having the benefit of this disclosure that many more modifications than mentioned above are possible without departing from the inventive concepts disclosed herein.

What is claimed is:
1. A method for detecting malicious objects on a computer system, comprising: collecting data describing a state of an object of the computer system; forming a vector of features characterizing the state of the object; forming a hash function of the vector of features, so that a degree of similarity of the vector of features and an inverse hash function of the result of the formed hash function of the vector of features will have a degree of similarity greater than a specified value; calculating the degree of similarity based on the formed vector of features using a trained detection model, wherein the degree of similarity is a numerical value characterizing the probability that the object being classified belongs to a given class and wherein the detection model is trained based on the hash functions of the vector of features; calculating a limit degree of difference, using the trained detection model, wherein the limit degree of difference is a numerical value characterizing the probability that the object being classified belongs to another class, wherein the limit degree of difference is calculated based on analysis of static parameters of the object and depends on the degree of similarity; forming a criterion for determination of class of the object based on the degree of similarity and the limit degree of difference, wherein the criterion is a rule for classification of the object by an established correlation between the degree of similarity and the limit degree of difference and wherein the correlation between the degree of similarity and the limit degree of difference is one or more of: a difference in distance between the degree of similarity and the limit degree of difference from a predetermined threshold value; a difference in the area bounded in a given time interval between the degree of similarity and the limit degree of difference from a predetermined threshold value; and a difference in the rate of mutual growth of the curve describing the change in the degree of harmfulness and the limit degree of difference from a predetermined value; determining that the object belongs to the determined class when the data satisfies the criterion, wherein the data is collected over a period of time defined by a data collection rule, wherein the data collection rule is one of: an interval of time between different states of the object satisfies a predetermined value, and a change in a parameter of the computer system resulting in a change in state of the object satisfies a given value; and pronouncing the object as malicious when it is determined that the object belongs to the specified class.
2. The method of claim 1, wherein the vector of features is a convolution of collected data organized in the form of a set of numbers.
3. The method of claim 1, wherein the limit degree of difference is calculated one of: at the instant of creating the object, at the instant of a first change in state of the object.
4. The method of claim 1, wherein if in the course of the period defined by the data collection rule at least two degrees of similarity and limit degrees of difference have been calculated, a set of consecutively calculated degrees of similarity and limit degrees of difference is described by a predetermined time law.
5. The method of claim 4, wherein the time laws describing the consecutively calculated degrees of similarity and the consecutively calculated limit degrees of difference are monotonic.
6. A system for detecting malicious objects on a computer system, comprising: a hardware processor configured to: collect data describing a state of an object of the computer system; form a vector of features characterizing the state of the object; form a hash function of the vector of features, so that a degree of similarity of the vector of features and an inverse hash function of the result of the formed hash function of the vector of features will have a degree of similarity greater than a specified value; calculate the degree of similarity based on the formed vector of features using a trained detection model, wherein the degree of similarity is a numerical value characterizing the probability that the object being classified belongs to a given class and wherein the detection model is trained based on the hash functions of the vector of features; calculate a limit degree of difference, using the trained detection model, wherein the limit degree of difference is a numerical value characterizing the probability that the object being classified belongs to another class, wherein the limit degree of difference is calculated based on analysis of static parameters of the object and depends on the degree of similarity; form a criterion for determination of class of the object based on the degree of similarity and the limit degree of difference, wherein the criterion is a rule for classification of the object by an established correlation between the degree of similarity and the limit degree of difference and wherein the correlation between the degree of similarity and the limit degree of difference is one or more of: a difference in distance between the degree of similarity and the limit degree of difference from a predetermined threshold value; a difference in the area bounded in a given time interval between the degree of similarity and the limit degree of difference from a predetermined threshold value; and a difference in the rate of mutual growth of the curve describing the change in the degree of harmfulness and the limit degree of difference from a predetermined value; determine that the object belongs to the determined class when the data satisfies the criterion, wherein the data is collected over a period of time defined by a data collection rule, wherein the data collection rule is one of: an interval of time between different states of the object satisfies a predetermined value, and a change in a parameter of the computer system resulting in a change in state of the object satisfies a given value; and pronounce the object as malicious when it is determined that the object belongs to the specified class.
7. The system of claim 6, wherein the vector of features is a convolution of collected data organized in the form of a set of numbers.

8. The system of claim 6, wherein the limit degree of difference is calculated one of: at the instant of creating the object, at the instant of a first change in state of the object.
9. The system of claim 6, wherein if in the course of the period defined by the data collection rule at least two degrees of similarity and limit degrees of difference have been calculated, a set of consecutively calculated degrees of similarity and limit degrees of difference is described by a predetermined time law.
10. The system of claim 9, wherein the time laws describing the consecutively calculated degrees of similarity and the consecutively calculated limit degrees of difference are monotonic.

11. A non-transitory computer-readable medium, storing instructions thereon for detecting malicious objects on a computer system, the instructions comprising: collecting data describing a state of an object of the computer system; forming a vector of features characterizing the state of the object; forming a hash function of the vector of features, so that a degree of similarity of the vector of features and an inverse hash function of the result of the formed hash function of the vector of features will have a degree of similarity greater than a specified value; calculating the degree of similarity based on the formed vector of features using a trained detection model, wherein the degree of similarity is a numerical value characterizing the probability that the object being classified belongs to a given class and wherein the detection model is trained based on the hash functions of the vector of features; calculating a limit degree of difference, using the trained detection model, wherein the limit degree of difference is a numerical value characterizing the probability that the object being classified belongs to another class, wherein the limit degree of difference is calculated based on analysis of static parameters of the object and depends on the degree of similarity; forming a criterion for determination of class of the object based on the degree of similarity and the limit degree of difference, wherein the criterion is a rule for classification of the object by an established correlation between the degree of similarity and the limit degree of difference and wherein the correlation between the degree of similarity and the limit degree of difference is one or more of: a difference in distance between the degree of similarity and the limit degree of difference from a predetermined threshold value; a difference in the area bounded in a given time interval between the degree of similarity and the limit degree of difference from a predetermined threshold value; and a difference in the rate of mutual growth of the curve describing the change in the degree of harmfulness and the limit degree of difference from a predetermined value; determining that the object belongs to the determined class when the data satisfies the criterion, wherein the data is collected over a period of time defined by a data collection rule, wherein the data collection rule is one of: an interval of time between different states of the object satisfies a predetermined value, and a change in a parameter of the computer system resulting in a change in state of the object satisfies a given value; and pronouncing the object as malicious when it is determined that the object belongs to the specified class.
12. The medium of claim 11, wherein the vector of features is a convolution of collected data organized in the form of a set of numbers.